Editor’s Preface


In his dissertation, Philipp Kolo examines the measurement and econometric effects of
ethnic diversity. This issue is of great relevance to research and policy and is currently
being discussed a great deal in the literature. In particular, a sizable literature has
suggested that ethnic diversity constitutes a significant barrier to economic development.
The precise measurement and interpretation of these results are a matter of substantial
controversy. This book makes significant contributions to these debates. First, the dynamics
of ethnic diversity are empirically analyzed for the first time. Second, it develops and
applies a new measure of ethnic diversity which takes the distance between groups into
account, thus focusing on diversity rather than mere fragmentation. Mr. Kolo convincingly
confronts theoretical considerations with (new) data and thereby provides a good mix of
theory and empirics and valuable input to this field. These two aspects are new to this
extremely diverse area of literature, and Mr. Kolo shows that he is well aware of recent
developments in the field and is able to contribute significantly to it.

Chapter 1, which presents the first substantial analysis, provides the theoretical basis
for the empirical Chapter 2 that follows. Here the development of ethnic diversity over
time is explained within a model framework. Above all, the influences of education,
development, trade and immigration are theoretically examined, illustrating how these
factors can influence the development of ethnic diversity.


In the second chapter, the level of ethnic diversity and its trends are empirically
analyzed. Initially, the factors influencing ethnic diversity are derived from the
literature and regressions are then run. The results show that there is a ‘base level’ of
heterogeneity, determined by geography and evolutionary factors. Additionally, it is found
that the nature of colonization has a particularly strong influence, while urbanization,
education and immigration are the most influential factors regarding changes in ethnic
fractionalization over time. Showing the dynamics of ethnic fractionalization empirically
is a major contribution of this dissertation. The results here are based on the data on
diversity that Mr. Kolo has discovered over the last two years and these will certainly be
received with great interest.


In the third chapter, a new measure of ethnic diversity is then generated, which, as
mentioned, takes the distance between groups into account. The so-called distance adjusted
ethno-linguistic fractionalization index (DELF) builds on an impressive amount of new data
to address this issue. Mr. Kolo calculates three indices of religious, ethno-racial, and
linguistic diversity, and an overall index based on these three components. The main
analysis weights the three components equally. However, the appendix reports a substantial
amount of detail on different possible weighting schemes, showing the results to be robust.
Again, this is very well derived and almost solely based on new data. This, in turn,
addresses yet another important desideratum in the literature.

Finally, in the last chapter, this measure is employed in order to replicate a number of
different analyses from the literature. In particular, the influence of ethnic diversity on
conflict, growth, trust, trade and the mutual opinions of different populations towards
their counterparts is examined. In these cases, it is shown that this measure captures the
effects just as well as, and sometimes even better than, existing measures. A genuine
contribution to the literature is also achieved here, and it is impressive to see how many
studies are replicated and further enriched through this new measure.

Altogether, this thesis provides a highly interesting and sophisticated theoretical as well
as empirical evaluation of the measurement, determinants, and consequences of ethnic
fragmentation and diversity. The fact that all four pieces break new ground in terms of
methodological and empirical analysis is particularly commendable, and with this, Philipp
Kolo has succeeded in providing several important contributions to the literature.


Prof. Stephan Klasen (Ph.D.) 
Göttingen, April 2012 
Chapter 0


Introduction and Overview 


Sed Angelus est melior quam lapis. Ergo duo Angeli sunt aliquid melius
quam Angelus et lapis. (...) Quod quamvis Angelus absolute sit melior quam
lapis, tamen utraque natura est melior quam altera tantum: et ideo melius est
universum in quo sunt Angeli et aliae res, quam ubi essent Angeli tantum.

Thomas D'Aquinas - Scriptum super Sententiarum¹


Valuing two different things and assigning a personal hierarchy to them is often feasible.
Valuing a combination of these things is, however, more complicated. Not only are the
values of the single objects important; the specific combination of these (dis)similar
elements is also essential, which is why any valuation cannot be a simple addition of its
elements. This fundamental concept is well illustrated by the opening quotation from Thomas
D'Aquinas, written some 750 years ago, and must have been an essential consideration for
Noah when he boarded his ark. The quantity of any single species was of less importance
than having the highest possible diversity. In 1992, more than 150 countries ratified the
Rio Convention, aiming towards the “conservation of biological diversity”.² Furthermore, at
the end of 2010 the United Nations General Assembly declared that the decade 2011-2020
would be the ‘United Nations Decade on Biodiversity’. Despite all the efforts towards, and
challenges of, safeguarding biodiversity, there is at least a common understanding that
this diversity is something exceptional and deserves to be protected and conserved.³

When writing his essay, Thomas D'Aquinas certainly did not refer exclusively to the
diversity of animals and plants, but also to the different natures of humankind. So, what
about their (ethnic) diversity? Is the agglomeration of different cultural, religious or
language groups just as unequivocally seen as something exceptional, deserving of
protection and conservation? Dalby (2003) believes that 2,500 languages are likely to be
lost over this century. With fewer than 7,000 living languages in the world listed by the
Ethnologue project (Lewis, 2009), this heavily impacts the diversity of global languages.
However, the value of any of these lost languages is as hard to assess as the loss of any
species to biodiversity.⁴ So, is ethnic diversity as threatened as biodiversity?⁵

1 “Since an angel is better than a stone, therefore two angels are better than one angel
and a stone. (...) Although an angel, considered absolutely, is better than a stone,
nevertheless two natures are better than one only; and therefore a universe containing
angels and other things is better than one containing angels only” - Thomas D'Aquinas,
Scriptum super Sententiarum, lib. 1 d. 44 q. 1 a. 2, 6 and lib. 1 d. 44 q. 1 a. 2, ad 6
(D'Aquinas, 1873, Vol. VII, p. 527-528). Translation taken from Lovejoy (1957, p. 77).

2 Article 1 of the Convention on Biological Diversity (http://www.cbd.int/convention/text/).

3 Besides biodiversity's instrumental value, one ascribes a high intrinsic value to it. Its
instrumental value, for example, arises from its potential agricultural or pharmaceutical
applications. In contrast, the intrinsic value of biodiversity originates from its mere
existence.

Until the 20th century, the ethnic composition of countries was more associated with
established nation states. In this regard, ethnicity was more a unifying factor than one
that posed any threat of conflict. Over the course of history, however, the concepts of
nation states and ethnic diversity became diametrically opposed. Since then, there have
been many negative, despotic, nationalistic eras in history, but also constant positive
examples of coexistence. These allegedly opposing extremes culminated when Huntington
(1993) proclaimed the ‘clash of civilizations’. In his view of a post-Cold War era, the
ideology-driven conflict of that time is replaced by cultural and religious clashes between
global civilizations. The rather arbitrary division of the world into eight civilizations,
on whose borders conflicts are supposed to arise, has drawn a lot of criticism.⁶

Having eight civilizations is indeed a very superficial classification that fails to take
the ethnic setup and internal dynamics within these civilizations into account. What’s
more, not only are these civilizations diverse, but so are the countries within them, which
all differ in their levels of diversity. Increased mobility, economically and socially, has
fueled ethnic diversity, for example, in Europe. Whether these dynamics stretch the
European community too far, or whether they are even the basis for Europe’s future success,
remains debatable. This dissertation aims to improve the understanding of ethnicity and its
potential implications. However, in order not to get lost in equally arbitrary or vague
characterizations, as clear a definition as possible of the main concepts, ethnicity and
diversity, as well as of the extent of the economic implications against whose backdrop
both will be examined, seems necessary.

4 Admittedly, the extinction of languages, even major ones, is anything but new. Latin, the
language of the Roman Empire, is one of the most prominent examples. Conversely, the
evolution of languages has also created new ones. The Romance languages that evolved from
the common Latin origin and various Creole languages, through mixing with the languages of
colonizers, are such examples. If one does not assign, for example, language diversity any
intrinsic value, the disappearance of a language is just the result of its instrumental
value dropping insofar as it no longer fulfills its speakers’ socioeconomic needs
(Mufwene, 2005).

5 For an approach to reconciling biodiversity and cultural diversity, see Loh and Harmon
(2005). They construct a combined biocultural diversity index. Equally, Evers et al. (2010)
apply methods of biodiversity research to analyses of Malaysia’s multicultural society.

6 The eight civilizations are meant to be the “Western, Confucian, Japanese, Islamic, Hindu,
Slavic-Orthodox, Latin American and possibly African civilization” (Huntington, 1993,
p. 25). Although the strong differences between civilizations are the key motivation for
his claim, it lacks a consistent logic explaining the reason behind selecting exactly these
eight civilizations. The fact that the number of distinct civilizations is not even clearly
defined is covered well in the well-versed critique of Tipson (1997). Additionally, there
are several other lines of critique. Whereas Huntington (1993) suggests that democracy was,
and still is, a uniquely Western value, Sen (1999) points to the significant differences in
democratic traditions among Western countries and to the democratic traditions found
historically in other regions of the world. The categorization of roots of conflicts is
another line of critique. Huntington (1993) claims that before the French Revolution,
conflicts were between princes and emperors over influence and territory. The period
following this is exemplified by the fight between people and nations, until the root of
confrontation was replaced by ideology after both World Wars. This simplification fails to
cover earlier cultural or ideological conflicts during the Reformation, the Thirty Years
War, or the period of Enlightenment. A final line of critique is that ethnic grouping may
also arise only due to ideological mobilization by elites contending for political
influence and unfulfilled socioeconomic needs, especially within countries. This will be
briefly discussed in Chapter 1.


Ethnicity. Economists mostly fail to give a more thorough definition of ethnicity, and
there is wide agreement that it is a “rather vague and amorphous concept” (Alesina et al.,
2003, p. 160). Although a clear-cut definition is indeed difficult, at least a common
understanding is crucial. The Encyclopædia Britannica defines an ethnic group as “a social
group or category of the population that (...) is set apart and bound together by common
ties of race, language, nationality, or culture” (Encyclopædia Britannica, 2007, Vol. IV,
p. 582). Thus, these groups need to be distinguishable from each other along a defined
characteristic. According to the above definition, these are mainly:⁷


• Language
Language is a fundamental mechanism through which people create social life and the means
for any interaction. Anyone who has ever learnt another language is conscious of their
differences. For an Italian, it is generally easier to learn Spanish than Japanese, for
example. Thus, language is probably the clearest characteristic, and its differences are
rather well defined.

• Race
The racial part of the definition draws on some biological classification. It may be
described as a population with a common ancestry and shared habits that represent a common
genetic pool (Barrett et al., 2001, Vol. II). These physical characteristics need to be
understood in light of evolutionary processes as an adaptation to different environments
and should not be confused with any racist categorization.⁸

• Culture
The aspect of culture is probably the least clear due to the ambiguous nature of its roots
and the fact that it is mutually influenced by the previous aspects. Culture is supposed to
consist of “languages, ideas, beliefs, customs, taboos, codes, institutions, tools,
techniques, works of art, rituals, ceremonies and other related components” (Encyclopædia
Britannica, 2007, Vol. III, p. 784). Beliefs and all forms of religion influence many
components of this definition. Thus, it seems obvious to include religion as an important
pillar of one's culture.⁹ As there is also strong interplay between languages and cultural
behavior,¹⁰ ethnolinguists have termed groups defined along this double identification as
ethno-linguistic groups.

• Nationality
Nationality originates from the historical idea of a unified nation state. This
characteristic is less important for describing modern, ethnically diverse countries and is
probably the one which is the easiest to change. One does not necessarily need to learn a
new language or adapt to a new culture to obtain a passport.

7 Many sources use very comparable sets of characteristics. See, for example, Barrett et al.
(2001), Alesina et al. (2003), Okediji (2005) or de Groot (2009).

8 In this regard, skin pigmentation is a good example. For the above characterization, it is
seen as reflecting an adaptation to geographic particularities, i.e., in reaction to
different intensities of UV light.

9 This not only includes the main ‘institutionalized’ religions, but all animist- and
ethno-religions that existed long before the religions we know of today.


These aspects defining an ethnic group, according to the Encyclopædia Britannica, are
obviously not clear-cut concepts; they often overlap and depend especially on the
self-assignment of individuals.¹¹ It is important, however, for any social or economic
interpretation that these aspects may be observed by individuals and are used to determine
any form of ‘otherness’ between two individuals or groups. Only if this (socially
constructed) ‘otherness’ has an impact on social interactions does it subsequently affect
economic outcomes.¹² In line with most of the economic literature, for the remainder of
this dissertation, diversity will focus on the three ‘clearer’ aspects of language, race
and religion.¹³


Diversity. Although one generally speaks of an ethnically heterogeneous country, there are
two very distinct concepts used in the economic literature for its measurement: ethnic
fragmentation and ethnic diversity. The fragmentation or fractionalization of countries
assesses the multitude of different groups. This is the most widely used index in the
economic literature and was introduced by Taylor and Hudson (1972) as the ethno-linguistic
fractionalization index (ELF).¹⁴ It is based solely on the relative sizes of groups defined
along any of the above characteristics.¹⁵

Diversity, in contrast, is a more elaborate measure of ethnicity as it takes the
dissimilarities between groups into account. To assess a country's fragmentation, it is
sufficient to know that there exist, say, two angels and one stone. To assess the diversity
of this small country, one needs to assess the distances between both groups, which are
“such an absolutely fundamental concept in the measurement of dissimilarity that it must
play an essential role in any meaningful theory of diversity or classification” (Weitzman,
1992, p. 365). To arrive at these distances, one needs to know more about the
characteristics of both groups, which makes the clear definition of groups based on any of
the above concepts even more important. The introduction of an index covering ethnicity’s
diversity in contrast to its mere fragmentation is the main focus of Chapter 3.¹⁶ As
countries can be heterogeneous in both ways, i.e., being fragmented or being diverse,
heterogeneity is used as the general term for both.

10 A classic example is the multitude of words the Inuit culture has for snow, underlining
the close relationship of both concepts (Encyclopædia Britannica, 2007, Vol. IV).

11 Besides the definition of what ethnicity comprises, why ethnic groups emerge and change
are equally important questions. This will be addressed in more detail in Chapters 1 and 2.
A related problem is the situational or contextual identification with one or the other
group. For a distinction between ethnic structure (descent-based attributes) and ethnic
practice (activation of these attributes), see Chandra and Wilkinson (2008). This
distinction is, however, not in the scope of this dissertation.

12 With this important point, a difference between ethnic diversity and biodiversity becomes
obvious.

13 A more detailed definition of these three pillars, as they are defined for the purpose of
this dissertation, is found in Chapter 3.

14 The data source on which Taylor and Hudson (1972) based their first ELF, the Atlas
Narodov Mira (Bruk, 1964), is mainly defined along ethno-linguistic criteria, which explains
the name. However, the ELF is now also calculated based entirely on linguistic, ethnic or
religious groups.

15 The mathematical attributes of all of the index calculations will be discussed in the
respective chapters.


Fragmentation/Fractionalization
- Only the relative group sizes in a given country are taken into account
- Measures the probability that two randomly drawn individuals do not belong to the same
  group
- The associated and widely used measure in the economic literature is the ethno-linguistic
  fractionalization index (ELF)

Diversity
- In addition to the relative group sizes, the distances between those groups are taken
  into account
- Measures the expected dissimilarity of two randomly drawn individuals
- Despite first attempts, no index is yet commonly adopted in the economic literature


Figure 0.1: Difference of fragmentation and diversity measure
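
The contrast drawn in Figure 0.1 can be illustrated with a minimal numerical sketch. It
assumes the common form of the fractionalization index, one minus the sum of squared group
shares, which matches the probability interpretation given in the figure, and it reads
"expected dissimilarity" as the share-weighted average of pairwise group distances. The
function names, the distance values and the angels-and-stone shares are illustrative
assumptions, not the definitions used in the later chapters.

```python
from itertools import product


def fractionalization(shares):
    """Probability that two randomly drawn individuals belong to different groups."""
    return 1.0 - sum(p * p for p in shares)


def expected_dissimilarity(shares, distance):
    """Expected distance between two randomly drawn individuals.

    distance[i][j] is an assumed dissimilarity in [0, 1] between groups i and j,
    with distance[i][i] == 0.
    """
    n = len(shares)
    return sum(shares[i] * shares[j] * distance[i][j]
               for i, j in product(range(n), repeat=2))


# Hypothetical "country" of two angels and one stone.
shares = [2 / 3, 1 / 3]

# Assumed distances: maximally distant groups versus fairly similar groups.
far = [[0.0, 1.0], [1.0, 0.0]]
near = [[0.0, 0.25], [0.25, 0.0]]

elf = fractionalization(shares)                      # depends on shares only
div_far = expected_dissimilarity(shares, far)        # equals elf when all distances are 1
div_near = expected_dissimilarity(shares, near)      # lower, although shares are unchanged
```

Under these assumptions the fragmentation value stays the same whatever the groups are,
whereas the diversity value shrinks as the groups become more alike, which is the
distinction the figure draws.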


Economic implications. Economists only started to engage in discussions surrounding the
‘Noah’s Ark Problem’ (Weitzman, 1992, 1998) in the 1990s. Today, a wide range of
socioeconomic problems are supposed to be linked to a country’s ethnic heterogeneity.¹⁷
Mauro (1995) is considered to be the first to assess the effect of ethnicity on economic
outcomes empirically. He linked a higher level of ethnic fragmentation to higher levels of
corruption. Soon after, Easterly and Levine (1997) held the apparently higher ethnic
fractionalization of Africa responsible for its ‘growth tragedy’. The focus on GDP per
capita levels subsequently became one of the major strands of the literature.¹⁸ Departing
from the outcome, mirrored in higher income levels (GDP per capita), the focus moved to
various socioeconomic factors that are supposed to affect different income levels.

Alesina et al. (1999) showed that public goods provision is lower in ethnically more het- 
erogeneous communities, with communal participation equally being reduced (Alesina and 
La Ferrara, 2000). La Porta et al. (1999) and Alesina and Zhuravskaya (2011) document 
the negative impact ethnic heterogeneity has on the general quality of government. Thus, 
in general, higher ethnic diversity is associated with poorer institutions and governance. 


Bjørnskov (2007, 2008) searches for a correlation between ethnic fragmentation in countries
and the general level of trust.¹⁹ Equally, the level of redistribution is supposed to be
lower for more heterogeneous countries (Desmet et al., 2009). Finally, Felbermayr and
Toubal (2010) show a trade-increasing effect for countries that are ethnically closer.²⁰
Collier and Hoeffler (2002) initiated the second most prominent strand of literature by
exploring the role ethnicity plays in the incidence, onset or duration of conflicts, which
was subsequently extended by Collier and Hoeffler (2004), Collier et al. (2009), and
Garcia-Montalvo and Reynal-Querol (2002, 2005b, 2008, 2010). Additional conflict leads to
even lower institutional quality. This is often compensated for by higher ethnic
identification, which makes interaction possible in the absence of codified laws and proper
governance structures, starting a vicious cycle.²¹ A more salient ethnic identification
leads, in line with the above literature, to worse economic performance and suboptimal
institutional structures.²²


Figure 0.2: Overview of ethnicity’s role in the economic context


16 Besides these two major concepts, which will be the focus of this dissertation, an index
of polarization has drawn more attention. This measure will be discussed in the essays
whenever deemed necessary to offer a broader picture, or when it is of equal importance for
specific questions.

17 These analyses rely almost entirely on the measure of ethno-linguistic fractionalization
(ELF).

18 See, for example, Collier (1998), La Porta et al. (1999), Alesina et al. (2003), Alesina
and La Ferrara (2005) or Garcia-Montalvo and Reynal-Querol (2005a).


Surprisingly, most results point to a negative effect and thus document the societal costs
of ethnic diversity. Only a few articles question this biased analysis of ethnic
heterogeneity. Alesina and La Ferrara (2005), for example, posit that its impact may well
be positive but depends on the level of development in a country.²³ Thus, does ethnic
diversity only have positive effects for countries that can afford it? There might indeed
be high societal costs in order to achieve proper communication, education, and higher
quality of society's institutions in general, and thus to overcome all the documented evils
of a more heterogeneous community.

A better understanding of the roots of ethnicity, what drives its changes and a different
method of measurement might help to bridge the gap between the very different perceptions
of biodiversity and cultural diversity.

19 This is inspired by earlier work of Zak and Knack (2001).

20 This result, which is confirmed in Chapter 4, does not, however, mean that two countries
need to be more homogeneous. The contrary may be the case. When sizeable diasporas are
present in countries, they may be internally more diverse. These groups, in turn, exhibit
closer (ethnic) ties to their home countries, being one reason for an increased trade
volume between these two countries. Thus, expelling an ethnic group may make a country
internally more homogeneous but would reduce the ethnic ties to the expelled groups’ home
country, limiting the trade volume.

21 See, for example, Greif (1993) for historic examples of where kinship ties replaced
codified laws and institutions, or Akerlof and Kranton (2000) on how identity associated
with different (social) categories influences economic outcomes.

22 Based on these results, one would assume that every country would strive for higher
homogenization or assimilation. This is not a necessary result, as demonstrated in
Chapters 1 and 2.


[Figure: rows labeled "Theoretical foundation" and "Empirical foundation"; recoverable
panel titles: "On the Dynamics of Ethnicity", "Measuring Ethnic Diversity"]

Figure 0.3: Structure of the dissertation


Structure of the dissertation. This dissertation consists of four distinct essays, each
covered in a chapter of its own. The essays complement each other in pairs, thus forming
two main parts. Both parts are based on a strong theoretical foundation and are
subsequently empirically tested. The first part adds new insights towards a more profound
understanding of ethnic fragmentation. It models the dynamics of fragmentation and then
analyzes empirically the drivers of any change in a country's ethnic fragmentation. The
second part offers a new index measuring the important aspect of ethnic diversity, in
contrast with the standard indices. Due to this additional aspect of ethnic diversity
covered in the new index, it performs differently in explaining a range of the
aforementioned economic implications.

23 Schüler and Weisbrod (2010) show that the effect for countries whose ethnic
fragmentation is mainly due to high immigration (e.g., Australia) is less detrimental.
Similarly, with well-established institutions the negative effect can equally be mitigated
(Alesina and La Ferrara, 2005; Easterly, 2001). Contrary to the above literature based on
cross-country analyses, there are some articles focused on (metropolitan) regions and
companies that document positive effects of diversity. This is mainly attributed to the
impact of ethnic diversity on the degree of innovation and consequent increase in
productivity. Ottaviano and Peri (2005), Ozgen et al. (2011a,b) and Sparber (2010) confirm
productivity increases at the regional level and for selected countries. Regarding
companies, Prat (2002) shows, in a game theoretical analysis, that the positive impact of a
heterogeneous versus a homogeneous team depends on the complementary nature of their tasks.
A comparable result is also found in Hong and Page (1998).


Dynamics and drivers of ethnic fragmentation 


Chapter 1. Contrary to biodiversity, where the dissimilarities between its elements are
crucial, most of the economic literature uses the measure of ethno-linguistic
fractionalization (ELF). Starting out with this concept, the first chapter summarizes the
theoretical discussions on the dynamics of ethnicity. Obviously, ethnicity is not a static
concept, as ethnic identification and its translation into ethnic groups are subject to
change. This is transferred into a theoretical model that provides a close connection to
the index of ethno-linguistic fractionalization (ELF). It shows that countries are
generally not faced with a continuous trend to become more homogeneous, instead
illustrating that they may well retain their level of ethnic fragmentation or even become
more heterogeneous.

The main contribution of this chapter is twofold. First, the present literature almost
completely excludes the dynamics of ethnicity from analysis, treating it as exogenously
given. Introducing a clear motivation within this framework of the dynamic nature of a
country’s ethnic setup challenges this basic assumption. The model is constructed in a way
that simulates the adaptation of the ELF index. This offers a better understanding of the
applicability and possible interpretation of the ELF index, especially regarding
endogeneity problems. Secondly, it outlines specific drivers responsible for these
adaptations. Beginning with a specific group constellation, economic development and
education drive homogenization. In contrast, migration and a more profound integration into
the international economy through trade at least retain, or even increase, the given level
of heterogeneity.


Chapter 2. Building on the theoretical foundation of the previous chapter, the second
chapter tests the drivers of ethnic fragmentation empirically. It is in line with recent
contributions outlining initial ideas as to why different levels of ethnic fragmentation
have evolved. These are mainly based on biodiversity and evolutionary theories and show
again the close connection between both kinds of diversity. It confirms the results that a
‘base level’ of fragmentation evolved due to geographical and evolutionary factors.

A new contribution is the closer examination of the role colonization plays in influencing
the levels of fragmentation, especially regarding how a country was colonized. Countries
where colonial powers did not have any incentive to settle and build good institutions,
instead exploiting the country’s resources, show a significantly higher level of ethnic
fragmentation.

The most important contribution of this chapter is to highlight the changes in the ethnic
setup over a rather short time frame. Although migration is the most obvious factor,
urbanization and education in particular play an even more important role in influencing a
country’s ethnic setup.
Measuring ethnic diversity and assessment of its implications


Because the ELF index is the most widely used measure, it was used for a closer analysis of
the dynamics of ethnicity and directly applies to the broad range of papers building on it.
Its selection is driven additionally by data availability, as consistent data for two
points in time was available only for ethnic fractionalization. However, the dynamics of
ethnic fractionalization can be easily transferred to the case of ethnic diversity. The
identification with a specific group, affecting the relative group sizes and thus the level
of fragmentation, is also a key building block of any diversity measure. Moving on now from
the more limited concept of ethnic fractionalization presented in the first two chapters,
the second part of the dissertation is dedicated to ethnic diversity.


Chapter 3. For any diversity index the introduction of distances between groups is
essential. For an appropriate diversity index, a combination of different characteristics
measured in a consistent way is used. Language, ethno-racial and religious characteristics
are combined for a composite similarity value. The resulting distance adjusted
ethno-linguistic fractionalization index (DELF) is based on an extensive amount of group
data, covering a wide range of countries. Whereas ethnic fragmentation (ELF) only contained
meaningful information for single countries, the DELF index can also assess differences
between countries, where it fills an even bigger gap.

The new diversity index, DELF, contributes in various ways. It uses a very detailed data
source, containing more than 12,000 groups defined along all three characteristics.
Finally, by applying an approach equivalent to that of the diversity measure for single
countries, it offers an assessment of cultural differences between countries. As the new
index measures a country’s ethnic diversity, it is a good starting point to review some of
the existing approaches linking ethnicity to economic outcomes.


Chapter 4 Developing a new index without testing its applicability is of limited merit. 
That is why this last chapter offers a range of applications for the DELF index. For many 
economic problems, it is not the pure quantity of (relative) groups which is of interest, but 
the difficulty of coordination or instrumentalization between these various groups. This is 
crucially dependent on the differences between those groups and not only on their mere 
existence. 

By replicating some established analyses, the DELF shows good applicability for 
conflict incidence compared to the often used index of polarization (POL). For growth, 
it confirms the commonly found detrimental effect. In an extension of the analysis of 
Alesina and La Ferrara (2005), a positive effect of the DELF, dependent on a country’s 
general level of development, is found — which is not the case for the ethno-linguistic 
fractionalization (ELF) index. Therefore, it is not about being able to ‘afford’ diversity 


in money terms; a broader level of development with higher education and health levels 
seems to be a prerequisite to harvest the positive implications of diversity and to break
away from the vicious cycle most of the previous literature alluded to. 

Furthermore, the DELF is tested on its applicability as a measure of cultural distance 
between countries. It can be shown that higher ethnic diversity between countries reduces 
the positive relationship between them. The general trust within countries, however, is not 
affected by a higher ethnic diversity. Finally, the DELF is a valuable measure for cultural 
affinity between countries, which affects trade flows positively. Overall, it substitutes a 
broad range of affinity proxies very well and its broad global coverage asks for a wider 


adoption. 


Outlook The results of the first two chapters on the roots and dynamics of ethnicity
have two implications for further research. First, they provide a strong basis to refute the
common assumption of its static nature. This raises the problem of ethnicity’s endogeneity, at least
for studies spanning several decades. In general, this does not question their results but
adds a caveat for their interpretation. Secondly, they offer a deeper understanding of the
nature of ethnic heterogeneity, a variable that has rightly become more and more important
in economic analyses. Understanding the roots and driving factors of the dynamics
of ethnicity is crucial for any meaningful further research.

The second part of this dissertation has an even stronger implication. The introduction
of the DELF index allows one to assess ethnic diversity based on multiple characteristics
within and between countries, covering nearly the entire globe. The mere quantity of
cultural backgrounds is of less importance than their diversity and the resulting higher
complementarity, which can fuel innovation and productivity.

The call for a rising awareness of ethnic diversity does not originate from a romanticized 
view of the world which is disconnected from further development and globalization. There 
will be a further loss of languages and traditions, reducing ethnic diversity — just like the 
extinction of a species reduces biodiversity. Evolution will bring about new languages and 
traditions — as new species develop adapting to a changing environment. Equally, societal 
costs in preserving a higher level of diversity will always accrue — as they do for biodiversity. 
This, however, does not call for more assimilation in order to avoid these costs, but for the 
strengthening of institutions and improvement of prerequisites encouraging the reduction 
of the costs of diversity, and capitalizing earlier on the positive returns it can bring. 

Having a better understanding of ethnicity’s impacts and a better set of tools for 
its analysis is an important step for putting the claim of a clash of civilizations into 
perspective. This is even more important during the current times of economic downturn, 
with nationalistic parties on the rise globally. They refer to and exploit the potential 
societal costs of cultural diversity without balancing it out with its prospective benefits. 

Endowed with a deeper understanding of the dynamics of ethnicity and a crucial new
measure of ethnic diversity, I encourage more research in this field to gain more insight
into its implications on economics and the broader development of countries, thus offering
better coverage of its full impact. Ethnic diversity, as is the case with biodiversity, is not
a necessary evil constraining the development of countries by burdening them with high
societal costs, but something worth preserving, with its benefits eventually outweighing
its costs.

Philipp Kolo - 978-3-653-02395-4
Downloaded from PubFactory at 01/11/2019 11:11:06AM
via free access
Chapter 1


The Dynamics of Ethnicity 


1.1 Introduction 


In the empirical literature, the role of culture and ethnicity in the economic context is
attracting more and more attention.24 This literature, however, almost completely excludes
the dynamics of ethnicity from the analysis, instead treating it as exogenously given. This 
is largely due to data constraints that impede tracking the dynamics of ethnicity. Still, 
as ethnicity is increasingly becoming a key variable for many current research strands, a 


better understanding of its dynamics is fundamental for the interpretation of the results. 


Despite the understandable limitations of the empirical literature, there are also only
limited efforts by economists to approach these dynamics on theoretical grounds. A few
exceptions offer some motivation for the dynamics of changing ethnic boundaries.25
Although these models try to offer a better understanding of decision processes to migrate, to
offer differentiated public goods or to fully assimilate, they lack a clear link to the growing
empirical literature. In particular, the dynamics found in the models are not linked to the
empirical operationalization of ethnicity. However, for empirical
analyses and the interpretation of their results, it is crucial to have a clear understanding
of whether and why ethnicity should be subject to change, and what its potential drivers are.
Based on an extension of the model by Lazear (1999), this essay provides this link
and a starting point to test these dynamics empirically. The model shows


24 For a more detailed overview, see, for example, Alesina and La Ferrara (2005) and Garcia-Montalvo
and Reynal-Querol (2003). For a broader discussion on the different concepts of ethnicity and its
operationalization, see Brown and Langer (2010).

25 Constant and Zimmermann (2007) discuss in a simple framework the main assimilation strategies of
immigrants. Bodenhorn and Ruebeck (2003) model and analyze the emergence of mixed ethnic groups in 
the United States in order to improve their economic position. Darity et al. (2006) use an evolutionary 
game theory model to show different ‘acculturation’ outcomes and Caselli and Coleman (2008) analyze the 
decision to change group membership within a model of ethnic conflict. Ahlerup and Olsson (2007) build 
their model on kinship-based social organization providing public goods. Finally, Lazear (1999) models 
assimilation processes of language groups to sustain or ameliorate trade. Subsequently, Kónya (2005) 
discusses the implication of multiculturalism versus a melting pot. 
that the group constellation, the costs of learning another language, economic growth, and
immigration rates all significantly influence a country’s heterogeneity. 

The remainder of this chapter is organized as follows. Section 1.2 discusses the
definition of ethnicity in general and describes the key aspects of the
main theoretical models. Section 1.3 extends the language assimilation model of Lazear
(1999) and describes the individual strategies in a state of autarky. In section 1.4,
individual behavior is aggregated to the overall society, and the resulting dynamics towards
possible equilibria are analyzed. Here, the link to the broadly used ethno-linguistic
fractionalization index (ELF) is discussed. Section 1.5 outlines some basic further extensions of
the model predictions in relation to globalization and international trade. Finally, section
1.6 summarizes the key findings, concludes and gives an outlook for further research.


1.2 General attributes of ethnicity and key models 


For interaction to occur between different individuals, a common factor, like origin or 
language, is assumed to be necessary to signal and establish a certain level of common 
ground (Leeson, 2005). This is less important in countries where institutions and codified
laws replace this signaling. In discussing these common markers, the economic literature
mainly focuses on three characteristics: ethnicity, language and religion. They offer rather
clear (observable) definitions that involve certain costs in order to be changed or adapted.26
This, however, does not answer the question of what has shaped or constantly shapes
ethnic identities and groups. Three main approaches are discussed in the literature to
explain these dynamics: the primordial, the instrumentalist and the constructivist.27
Primordialism views ethnic identities, and thus their group structure, as rather fixed,
showing a long historical continuity. Smith (1986) summarizes the primordialist view by
maintaining that “ethnic communities are the natural and integral elements of the human
experience,” and he regards “language, religion, race, ethnicity and territory as the basic
organizing principles and bonds of human association throughout history” (Smith, 1986,
p. 12). Similarly, Young (1998) describes the primordial dimension of ethnicity as an
“internal gyroscope, [a] cognitive map and dialogic library through which the social world
is perceived” (Young, 1998, p. 6). Likewise, van den Berghe (1981) sees ethnic groups as
nothing but an extension of the concept of kinship. Such nepotistic behavior can be
observed in all mammal species and is the result of an evolutionary survival strategy. Living
in an environment with only limited resources, sticking with one's kin leads to “greater
reproductive success and tend[s] to dominate all populations” (Ahlerup and Olsson, 2007,


26 See, for example, Bruk (1964), Alesina et al. (2003), and Fearon (2003), who build their measures on
the combined taxonomy of ethno-linguistic groups combined with other characteristics such as language,
ethno-racial belonging or religion.
27 For an extensive overview of these three analytical approaches, see Brown and Langer (2010) and
Le Vine (1997) for a more critical review.
p. 6). As these kinship groups grew, they developed common (cultural) traits or markers
to sustain the structure for a more extended group.


Instrumentalists basically build on the primordial structures but emphasize that several
of these aspects might be differently and selectively activated in different situations.
Ethnic groupings thus emerge around an identity-creating characteristic, which then
increases collective interest. This holds true especially for social and political interactions.
Consequently, the idea of social stratification and the emergence of political elites that
leverage ethnic identifications to mobilize supporters at a rather low cost are closely linked
(Bates, 2006).28 Young (1998) concludes that “in everyday political
and social interaction, ethnicity often appears in instrumental guise, as a group weapon
in the pursuit of material advantage” (Young, 1998, p. 6).


On the other hand, more recent factors and the emergence of nations have also left 
their traces on the development of ethnic groups. According to this constructivist view, 
major changes in the structure of human interaction arose through the development of 
modern nation states. Subsequently, the formation of nations and modern states shaped 
and changed group construction and identification drastically.29 Finally, Young (1998)
summarizes that ethnic “identities are socially constructed, a collective product of the
human imagination” (Young, 1998, p. 6) and are constantly reshaped in the mutual exchange
between the various groups and identities within a globalized world. Thus, constructivists
believe external forces and the interaction with other groups to be responsible for the
definition and attribution of one's ethnic group, which finally becomes a social construct
of a society itself.30


Fenton (2010) combines all three aspects very well, thus painting a more accurate
picture of the complex nature of ethnicity.31 In his view, ethnicities are “grounded as well as
constructed. Ethnic identities take shape around real, shared material experience, shared
social space, commonalties of socialization and communities of language and culture.
Simultaneously, these identities have a public presence; they are socially defined in a series
of presentations (...) by ethnic group members and non-members alike” (Fenton, 2010, p.
201). Thus, ethnicity as such contains some irrevocable core characteristics that represent
the most essential characteristics of a group, whereas other parts of the ethnic identity
might be subject to change. Having combined the different views of ethnic emergence,
according to Fenton its activation is due to “the degree of collective self-consciousness and
thus in the extent to which individual and collective action is calculated or instrumental in
the pursuit of ethnic ends” (Fenton, 2010, p. 201). So not every ethnic identity is activated


28 For some African case studies and an empirical investigation of how political competition affects ethnic
identification, see Eifert et al. (2007).
29 Miguel (2004) shows, using an African example, how nation building changed the affiliation to tribes.
30 For an effort to predict the social construction of ethnic identities, see Chai (2005).
31 A comparable approach is taken by Wimmer (2008), who develops a framework to explain which of
these theories are activated and, to a lesser extent, to explain the superiority of a single one.
in the political or societal arena, and this might differ due to a wide range of (contextual)
reasons.


Although these discussions on what brings humans to identify along ethnic structures
have mostly been led by non-economists, some theoretical discussions in economic models have
recently emerged. The major models are briefly discussed here to give an overview of
the current status of this strand of research, and also to make the differences compared to
the model outlined in this chapter clearer.


Bodenhorn and Ruebeck (2003) model and empirically test the changing assimilation
of blacks with lighter complexion into mixed ethnic groups in the mid-nineteenth century
in the US. They balance the improved economic opportunities emerging from a higher
degree of acceptance by the economically dominant white group with the (implicit) costs
of adopting and maintaining white culture. Additionally, blacks with lighter complexion
often faced punishment for abandoning their former group. To differentiate themselves
and to exclusively retain the monetary gains, they formed a comparable parochial group
of ‘mulattos’.32


Darity et al. (2006) and Caselli and Coleman (2008) build their models on the
excludability of individuals from a group based on phenotypical attributes. Darity et al. (2006)
use an evolutionary game theory model. The decision to follow either a racialist or an
individualist strategy leads to different ‘acculturation’ outcomes.33 Finally, they argue that
the construction of rigid racial identities and the cumulative effects of racial exclusion
lead to a wealth differential between different ethnic groups (mainly between blacks and
whites in the US). Whereas Darity et al. (2006) try to explain the general adoption and
austerity of cultural traits, Caselli and Coleman (2008) analyze the decision of individuals
to change group membership. If several groups quarrel over some expropriable resources
of a country, the identification of group membership plays a crucial role in the incidence
and severity of conflicts. The decision to start a conflict or war is for a considerable part
based on the possibility to exclude the adversary group from future gains, e.g., a country’s
natural resources setup.34 This introduces potential group change not only as a strategic
thought but as a direct possible incentive for an individual to pursue such a change.
Ahlerup and Olsson (2007) outline a model of endogenous group formation. They build
a model on kinship-based social organizations providing public goods. Differences between
the main types of living in ancient times (hunter-gatherers versus sedentary agriculture),
the role of statehood, and the way differences between core and periphery influence secession and
fractionalization tendencies are all analyzed. In contrast to the other models, the key


32 For an additional model on restrictive cultures strongly regulating members’ behavior and specifically
limiting social deference, see Carvalho (2010).
33 According to the notion of Darity et al. (2006), an individualist is a person who does not identify and
comply with the expected behavior of one's ethnic group.
34 In their empirical analysis, Collier and Hoeffler (2004) and Collier et al. (2009) follow the model
predictions of Caselli and Coleman (2008), according to which greed and thus opportunity is responsible
for the incidence of ethnic conflict rather than grievance.
contribution here is that it covers the emergence of new groups, instead of the changes
between existing groups and assimilation looked at in most of the other models. 

Lazear (1999) uses a single stage trade model to analyze language assimilation between 
minority groups and the majority group. The more a country is split into one clear 
majority and one or several small minorities, the faster one observes the assimilation 
process, because their value from assimilation is higher.35 The model outlined in this
essay is based on this basic trade model. 

Kónya (2005) builds on Lazear (1999) and offers two interesting extensions. In his 
model, a social planner is introduced to answer the question of whether multiculturalism 
or a melting pot was the best solution. Additionally, Kónya (2005) introduces the role of 
different population growth rates, internally or externally through migration. Differences 
in these rates can lead to stabilization in the assimilation of the (immigrant) minority 
group, and even to the displacement of the host’s majority group. Both aspects will also 


be treated by the model outlined here. 


1.3 Basic model 


The following model captures when and why language assimilation takes place within a 
country. It consists of two stages instead of only a single one, as is the case in the
original model of Lazear (1999).36 For interpretation and discussion, Lazear's interpretation
of ‘language assimilation’, i.e., learning of an additional language, is consistently used.
Although the model considers linguistic groups, a broader applicability of the model to
ethnic or cultural groups is obvious.37

First of all, the focus is on the decision of an individual in a single country, who speaks 
only one language. In the first stage, every individual can decide either to search for a 
trading partner or to invest his time into learning another language. Trade will only take 
place if two individuals meet who speak the same language. Learning a new language does 
not result in any trade in the first period, but increases the possibility of meeting a trading 
partner in the second period, because the individual now speaks an additional language. 

The respective relative population shares of the $k$ language groups within this country
are given by $p_g$ with $g \in \{1,\dots,k\}$, $p_g \in [0,1]$ and $\sum_{g=1}^{k} p_g = 1$. Any trade is assumed to
yield the same value for all individuals and is normalized to a value of one. If individuals
decide to engage in trade, the random matching process of monolingual individuals leads


35 The same dynamics were modeled by Church and King (1993) and Selten and Pool (1991) on game
theoretic grounds. Whereas the first is more similar to the approach followed here, the second opens up
the possibility of bilingualism but remains more general in its conclusions. 

36 This extended approach is drawn exemplarily from Galor and Zeira (1993). In their model, an indi- 
vidual has the choice to pursue unskilled work in both periods or to invest in education in the first period 
and to pursue skilled labor in the second stage. The resemblance to my model is obvious. 

37 For a discussion on the congruency of language and culture, see Chong et al. (2010). Falck et al. (2010) 
also conclude that “language differences are probably the best measurable indicator of cultural differences” 
(Falck et al., 2010, p. 30). In the following, it becomes clear that the processes and dynamics found for 
language can easily be transferred to other characteristics of ethnicity. 
to a probability of trade equal to $p_g$, the expected probability of meeting another individual
of one's own group.38 The expected revenue ($R_g$) of all individuals of group $g$ is then:

$$E[R_g] = p_g \cdot 1 = p_g \quad \text{for all } g \in \{1,\dots,k\} \tag{1.1}$$

Consider the groups $g$ and $h$, where $p_g < p_h$. It follows that $E[R_g] < E[R_h]$. Individuals
of smaller groups thus have a general incentive to assimilate into bigger groups in order
to increase the probability of trade.39
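
The random-matching argument behind Equation (1.1) can be checked numerically. The following is a minimal sketch and not part of the original model: the function name `simulate_trade_probability`, the example shares and the number of trials are illustrative assumptions. It draws random partners in proportion to the population shares and estimates how often a member of each group meets a partner who speaks the same language.

```python
import random


def simulate_trade_probability(shares, group, trials=100_000, seed=0):
    """Estimate the probability that a member of `group` meets a same-group partner.

    `shares` are the relative population shares p_g (they must sum to one).
    Partners are drawn at random in proportion to these shares.
    """
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to one")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    partners = rng.choices(range(len(shares)), weights=shares, k=trials)
    return sum(1 for partner in partners if partner == group) / trials


# Hypothetical country with one majority and two minority groups.
shares = [0.7, 0.2, 0.1]
estimates = [simulate_trade_probability(shares, g) for g in range(len(shares))]
for p_g, estimate in zip(shares, estimates):
    print(f"share p_g = {p_g:.2f}, simulated trade probability = {estimate:.3f}")
```

Under these assumptions the simulated frequencies should lie close to the shares themselves, which is the sense in which members of smaller groups face lower expected revenue.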

The effort an individual needs to make in order to learn another language is defined
as a function of learning costs $b(\theta, a_i)$. The costs depend on a general cost parameter $\theta$
and on the ability level $a_i$ of each individual $i$.40

$\theta$ defines the general cost function for each group member to learn another language,
depending on the differences between one’s own language and the one to be learnt.41 The
more similar two languages are, the more straightforward it is to learn them. This is
reflected in a lower $\theta$, and $\partial b(\theta, a_i)/\partial \theta > 0$ applies. Lazear (1999) describes the costs
$b(\theta, a_i)$ as the efficiency to learn another language.42 This can easily be linked to the
education level of each group. With no education, it is much harder to learn another
language than with a certain level of education already attained. More specifically, $\theta_{gh}$
affects the cost function of all individuals of group $g$ who learn the language of group $h$,
given by $b_g(\theta_{gh}, a_i)$.

Finally, each group consists of individuals with different levels of ability $a_i$. From
$\partial b(\theta, a_i)/\partial a_i < 0$ it follows that higher individual ability lowers the costs of learning. This
implies that it is not equally wise for all individuals to learn a new language.43 However,
the ones with the highest ability would decide in favor. The distribution function of $a_i$
within the respective group $g$ is given by $P_g(a_i)$. Consequently, the proportion of group $g$
that learns the language of group $h$ is then given through $P_g$.


38 A more general characterization of ‘communication benefits’ is used by Selten and Pool (1991). They
are directly influenced by the size of the language community.

39 This can also be interpreted as access to the bigger market and to higher possible revenues. A
comparable result is found by Christofides and Swidinsky (2010). They show for Canada that the wage premium
for learning an additional official language is much higher for the minority French-speaking group than for
the majority English-speaking group. Chiswick and Miller (2007, Ch. 3) discuss the decision of immigrants
in multi-language country surroundings which language to acquire, proving a general preference for the
majority language. A broader overview of studies that assess wage premiums of additional languages up
to 20% is found in Ginsburgh and Weber (2011, Ch. 5).

40 The same split into these two factors influencing an individual's cost function is applied by Selten and
Pool (1991).

41 Chiswick and Miller (2007, Ch. 20) assess the distance between two languages for a broad set of
languages in accordance with the difficulty to learn English. For a broader overview of different methods
to assess the distance between languages, see Ginsburgh and Weber (2011, Ch. 3).

42 If $\theta_{gh}$ covers only the differences between languages, the symmetry assumption $\theta_{gh} = \theta_{hg}$ is reasonable
to use. If it represents the overall efficiency or ease to learn another language, it is obvious that it might
be easier for group $g$ to learn $h$ than the other way round and $\theta_{gh} \neq \theta_{hg}$, for example, due to differences
in the educational level between both groups.

43 Based on these insights, Ginsburgh et al. (2007) subsequently construct demand functions for languages
within the European Union based on individual second-language learning costs (Gabszewicz et al., 2008).
Table 1.1 summarizes the potential pay-off in the first round for an individual of group
$g$ depending on the choice whether or not to learn the new language of group $h$ with
$p_g < p_h$.


             | No new language | New language
First stage  | $p_g$           | $-b_g(\theta_{gh}, a_i)$
Second stage | $p_g + P_g^{*}$ | $(p_g + P_g^{*}) + (p_h + P_h^{*})$


Table 1.1: First round pay-offs per stage and depending on the decision taken


In the first stage, not learning a new language and deciding to trade leads to an expected
pay-off of $E[R_g] = p_g$. Learning the language of group $h$ incurs, for an individual $i$ of group $g$,
costs of $b_g(\theta_{gh}, a_i)$. In the second stage, some adaptations in the group composition
arise due to the decisions in the first stage. $P_g^{*}$ is the proportion of individuals of all other
groups that assimilated into group $g$. Therefore, the potential for trade for an individual
of group $g$ might increase by the amount $P_g^{*}$.44 The individual of group $g$ who learnt
the language of group $h$ can now communicate in the second stage with individuals from
group $h$. The share of all other individuals that learnt either the language of group $g$
or group $h$, indicated by $P_g^{*}$ and $P_h^{*}$, additionally applies. The choice to learn another
language is always an individual one, i.e., collusive action within a group is not possible
in this model. An individual $i$ of group $g$ will decide to assimilate into another group $h$
with $p_g < p_h$ if:

$$(p_g + P_g^{*}) + (p_h + P_h^{*}) - b_g(\theta_{gh}, a_i) \geq p_g + (p_g + P_g^{*}) \tag{1.2}$$
$$\Leftrightarrow (p_h + P_h^{*}) - b_g(\theta_{gh}, a_i) \geq p_g$$
$$\Leftrightarrow f(p_g, p_h, \theta) := (p_h - p_g + P_h^{*}) - b_g(\theta_{gh}, a_i) \geq 0 \tag{1.3}$$


Without making any additional assumptions, this already leads to some insights. For 
a given and equal level of a; and 0 across all groups, individuals in the smallest group 
have the highest incentive for assimilation. The bigger a minority group gets, the lower 
the probability of assimilation of its members becomes. When the first wave of migrant 
workers arrived in Europe, they represented a very small minority and had high incentives 
to assimilate into the respective new country's main group. As these groups grew over the
years, the need and incentive to do this decreased.46 Upon approaching an even
split of the two groups, it is highly unlikely that any assimilation will take place, but the


44 However, with the definition $p_g < p_h$ this is rather a theoretical possibility.

45 Additionally, to already include the outcome of the second stage in the considerations of the first, one
normally needs to discount the second stage. Since it will not change the general outcome, this is
omitted for the sake of simplicity.

46 Danzer and Yaman (2010) find a negative effect of the size of the immigrants groups on the host
country's language proficiency in Germany. Additionally, immigrants with high learning costs prefer to 
move to enclaves of their own descent. 
status quo will persist. On the other hand, the higher the share $p_h$ of the bigger group
is, the more individuals from minority groups will assimilate. This tendency is intensified
even more if other individuals learn the language of group $h$, $P_h^{*}$. The harder it is to learn
the other language, $b_g(\theta_{gh}, a_i)$, the lower the potential for assimilation. These dynamics
can easily be seen by differentiating (1.3):

$$\partial f/\partial p_g < 0$$
$$\partial f/\partial p_h > 0$$
$$\partial f/\partial P_h^{*} > 0$$
$$\partial f/\partial b_g < 0$$


In highly fragmented countries, the revenue differential $(p_h - p_g)$ between two groups might
be too small for any group to assimilate, and no changes would take place. Thus, there is 
no general trend to more homogeneity, with very fragmented countries possibly persisting 
as well. 
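
The assimilation condition in Equation (1.3) and its comparative statics can be illustrated with a short sketch. The cost function used here, `theta / ability`, is purely a hypothetical functional form chosen because it rises in the cost parameter and falls in ability; the text does not specify one. The function names and all numerical values are likewise illustrative assumptions.

```python
def learning_cost(theta, ability):
    """Hypothetical cost b(theta, a_i): increasing in theta, decreasing in ability."""
    if ability <= 0:
        raise ValueError("ability must be positive")
    return theta / ability


def assimilation_gain(p_g, p_h, inflow_h, theta, ability):
    """Value of f in Equation (1.3): (p_h - p_g + P_h^*) - b_g(theta_gh, a_i).

    `inflow_h` stands for P_h^*, the share of others who learn language h.
    """
    return (p_h - p_g + inflow_h) - learning_cost(theta, ability)


def assimilates(p_g, p_h, inflow_h, theta, ability):
    """An individual of group g learns language h if the gain is non-negative."""
    return assimilation_gain(p_g, p_h, inflow_h, theta, ability) >= 0


# A member of a small minority facing a large majority.
small_minority = assimilates(p_g=0.1, p_h=0.9, inflow_h=0.0, theta=0.4, ability=1.0)
# Close to an even split, the same individual no longer gains from learning.
even_split = assimilates(p_g=0.45, p_h=0.55, inflow_h=0.0, theta=0.4, ability=1.0)
# A high-ability individual may still assimilate near the even split.
high_ability = assimilates(p_g=0.45, p_h=0.55, inflow_h=0.0, theta=0.4, ability=5.0)

print(small_minority, even_split, high_ability)
```

With these assumed numbers, the decision flips from assimilation to persistence as the two groups approach equal size, and flips back for an individual with sufficiently low learning costs, in line with the signs of the derivatives of (1.3).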

For any country, its development level and economic growth are highly important. That
is why the parameter $s(y) \geq 1$ is introduced. It covers more generally the country-specific
specialization or level of development.47 Depending on its stage of economic development
$y$, a higher $s(y)$ describes a more developed country. If only a few goods are traded in
a country and all groups rely on subsistence farming, there is no need to conduct trade
amongst each other, let alone with other groups. With an increasing trade volume, it is
more interesting to participate, as more is at stake. A higher $s(y)$ thus describes the overall
trade volume that is available in each round. We may therefore expect $\partial s(y)/\partial y > 0$.
Introducing $s(y)$ in Equation (1.2) leads to:48

$$s(y) \cdot [(p_g + P_g^{*}) + (p_h + P_h^{*})] - b_g(\theta_{gh}, a_i) \geq s(y) \cdot [p_g + (p_g + P_g^{*})]$$
$$\Leftrightarrow s(y) \cdot (p_h - p_g + P_h^{*}) - b_g(\theta_{gh}, a_i) \geq 0 \tag{1.4}$$


Comparing (1.3) and (1.4), one can clearly see that with a rising level of development, 
the return split would increase, and assimilation into the majority becomes more likely. 
Starting with a very fragmented country (small $p_g$ and $p_h$), potentially no group has an
incentive to assimilate, as the return split is still too low. With an increasing level of 
development, assimilation would become more likely, and countries should tend to homog- 


enize in this process. Very fragmented countries, paired with lower levels of development, 


47 Lazear (1999) points out that it is not entirely random for individuals to meet in a country, but one
can normally observe a grouping of peers or a segregation of groups. Lazear (1999) describes this as
specialization. A very remote tribe living in autarky would produce all that it needs and one would not
find any specialization between groups. In less developed countries the contact between various groups is
often additionally constrained due to poorer infrastructure, which fortifies the remoteness of these groups
or tribes. For more developed countries, specialization and thus trade possibilities increase in importance.


48 Additionally, it is assumed that the costs $b_g(\theta, a_i)$ are independent of a country's specialization level
$s(y)$.
would, however, be expected to remain in a very fragmented equilibrium. With rising
development, a continuous process of assimilation into the majority group is expected to 


take place. 
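
The role of development in condition (1.4) can be sketched in a few lines. This is an illustration under stated assumptions, not the author's implementation: the development factor is passed directly as a number `s` rather than as a function of income, the learning cost is treated as a fixed value, and the function names and example figures are hypothetical.

```python
def assimilation_gain_with_development(s, p_g, p_h, inflow_h, cost):
    """Left-hand side of Equation (1.4): s(y) * (p_h - p_g + P_h^*) - b_g."""
    return s * (p_h - p_g + inflow_h) - cost


def development_threshold(p_g, p_h, inflow_h, cost):
    """Level of s(y) at which (1.4) holds with equality, i.e. assimilation starts to pay."""
    differential = p_h - p_g + inflow_h
    if differential <= 0:
        return float("inf")  # no level of development makes assimilation worthwhile
    return cost / differential


# Hypothetical fragmented country: two small groups of similar size.
fragmented = dict(p_g=0.15, p_h=0.25, inflow_h=0.0, cost=0.3)

gain_low = assimilation_gain_with_development(1.0, **fragmented)
gain_high = assimilation_gain_with_development(4.0, **fragmented)
threshold = development_threshold(**fragmented)

print(f"gain at s = 1: {gain_low:.2f}")
print(f"gain at s = 4: {gain_high:.2f}")
print(f"threshold level of s: {threshold:.2f}")
```

In this example the gain is negative at the lowest level of development and turns positive once `s` exceeds the threshold, which mirrors the argument that fragmented countries remain fragmented until development raises the return to assimilation.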


1.4 Country equilibria 


Having discussed the decisions of individuals regarding assimilation into a new language
group, we can proceed to analyze the country equilibria. This is necessary to answer the
main question of whether, and how, a country’s heterogeneity changes. Before we
can draw these conclusions, some additional assumptions need to be discussed and
specified further.


1.4.1 Theoretical considerations and base scenario 


The processes outlined in the previous section will be repeated for several rounds. After 
the first two stages, the group sizes change, and therefore decisions in a new round face 
changed revenue and cost considerations. After each round, i.e., the two stages, each 
individual has to decide which of the two languages he will use in the next round, so that 
at the beginning of each round only monolingual groups exist. Although this is a strong
limitation, there are good arguments for it. The ethno-linguistic fractionalization
index (ELF) commonly used in the econometric literature does not allow for bilingualism,
as an individual can only belong to one specific group. The ELF index is calculated
like a Herfindahl-Hirschman concentration index, with:

$$\mathrm{ELF} = 1 - \sum_{g=1}^{k} p_g^2 \tag{1.5}$$


Its value moves between zero and one and represents the probability that two randomly 
selected individuals from a population come from different groups. A higher value thus 
indicates a more fragmented country. As this essay aims to provide a model interpretation 
of the expected dynamics of the ELF index, it seems fair to adopt the above assumption. 
An even more plausible interpretation seems to be that parents need to decide in which 
culture or language they want to raise their children, because learning a language and 
settling in a new group takes time and effort.^49 As every round represents a new generation,
this corresponds to a certain inertia in assimilation processes. In the
model, as well as in reality, adaptation processes to change one's culture or language are


49 In an extension of his model, Lazear (1999) also discusses the decision of parents as to which language or
culture will be passed on to their children. For a more detailed discussion on the vertical transmission of
culture and language, and an explicit model of these decisions, see Bisin and Verdier (2001) and Ashraf
and Galor (2007). Additionally, Aspachs-Bracons et al. (2007b) show in a case study for Catalonia that,
irrespective of the parents' origin, school children identify with the Catalonian identity the more intensively
they were taught in Catalan. This is carried forward in their political behavior (Aspachs-Bracons et al.,
2007a).

not subject to instantaneous changes.^50 This assumption therefore has an additional
implication. By deciding what language to activate in the next round, each individual
would decide to follow the bigger group.^51


Since we are now in the dynamic process, all individuals that learnt another language
and decided to no longer be in the original group need to be accounted for. These are
indicated by $P_g$, which is the net change of individuals speaking language $g$. Thus, $P_g$
can be positive or negative. With $p_g < p_h$, $P_g$ is assumed to be negative, as one expects that
more individuals of group $g$ decide to assimilate into group $h$ than vice versa. Reflecting
the discussion above, the adapted pay-offs for all subsequent rounds are outlined in Table
1.2:


| No new language | New language 


First stage Dg t+ P, —Dg(Agn; ai) 


Second stage Pg + P. TET. (p, + P; + 12) + (pr + P, + Px) 


Table 1.2: Individual pay-offs for all subsequent rounds 


In order to keep the modeling as simple as possible, a small restriction is imposed:
only assimilation into the majority group $m$ is discussed further, with $p_m =
p_h = \max\{p_g\}$. This is, however, not too restrictive. Learning a new language incurs costs
of $b_m(\theta, a_i) > 0$ for a member of the majority group $m$. However, the additional pay-off
in the second stage would not be the large share $p_m$ but only the share of the smaller group, $p_g$.
Thus, it would not be reasonable for a member of the majority to assimilate into a smaller group, as
costs accrue and the additional revenue $p_g$ is by definition lower than $p_m$, since $p_m = \max\{p_g\}$.
Thus, majority members do not learn a new language and receive a potential pay-off of
2- (p, +P.) + P!", benefiting from all other individuals that assimilated into the majority 
group.^52


The distribution functions $G_g(a_i)$ of $a_i$ within the respective group $g$ might be similar
across all groups, but they can also differ. The distribution function for the model
is given by a Beta distribution. The advantage of this distribution is that it offers a
continuous probability distribution over the ability interval $a_i \in [0, 1]$.


50 Belloc and Bowles (2009) explicitly model this inertia in their game theoretic analysis of the impact of
cultural conventions on the production and trade of products, where complete contracts are not feasible.
Guiso et al. (2007) offer some examples of how historical events and institutions, due to this inertia,
affect trust between countries.

51 This directly follows from Equation (1.1) and the discussions in the previous section. 

52 Although the majority group $p_m$ would by definition lead to the highest possible revenue, individuals of
group $g$ might decide to assimilate into another group $h$ with $p_m > p_h > p_g$. Despite the lower additional
revenue ($p_m > p_h$), $\theta_{gh}$ might be sufficiently lower than $\theta_{gm}$, and some individuals of $g$ might decide to
assimilate into $h$ instead of $m$. In the base analysis below, $\theta$ is assumed to be equal for all groups, thus
this option will not apply.

Additionally, the cost function $b_g(\theta, a_i)$ is specified further.
In the following, the cost function for an individual of group $g$ who wants to learn the
language of group $h$ is given by:

$$b_g(\theta_{gh}, a_i) = \log_{\theta_{gh}}(a_i) \tag{1.6}$$

This cost function is defined over the whole range of ability levels and fulfills all major
assumptions.^53 For higher ability levels, the costs are decreasing ($\partial b/\partial a_i < 0$). Additionally,
it holds that $\partial b/\partial \theta_{gh} > 0$. The more different two languages are, the higher the costs of
learning another language are for all ability levels.^54
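The two monotonicity properties of the cost function can be checked numerically. The following is a minimal sketch, assuming the logarithmic form of Equation (1.6); the function name `learning_cost` and the sample values are chosen here for illustration only.

```python
import math

def learning_cost(theta: float, ability: float) -> float:
    """Cost b = log_theta(ability), following the form of Equation (1.6)."""
    if not (0.0 < theta < 1.0):
        raise ValueError("theta must lie strictly between 0 and 1")
    if not (0.0 < ability <= 1.0):
        raise ValueError("ability must lie in (0, 1]")
    # Change of base: log_theta(a) = ln(a) / ln(theta); both logs are <= 0.
    return math.log(ability) / math.log(theta)

# Illustrative values (not taken from the text):
low_ability = learning_cost(0.2, 0.3)       # low ability, theta = 0.2
high_ability = learning_cost(0.2, 0.9)      # high ability, same theta
distant_language = learning_cost(0.6, 0.3)  # low ability, larger theta
```

Costs fall as ability rises and rise with the language distance, and an individual with the maximum ability of one faces no costs at all.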


[Figure: cost function $\log_{0.2}(a_i)$ and density $G(a_i)$ of the Beta(2,3) distribution, plotted against ability $a_i$ from 0 to 1; vertical axis: costs]

Figure 1.1: Density function for selected Beta(2,3) distribution and cost function with $\theta = 0.2$


As an example, the modeling is based on three groups with $p_A = 0.20$, $p_B = 0.30$ and $p_C = 0.50$,
corresponding to an ELF value of 0.62. The ability levels in all groups follow
a Beta(2,3) distribution. For simplicity, $\theta$ is likewise assumed to be equal for all groups,
with $\theta = 0.20$. The density function of ability levels and the corresponding cost function
are depicted in Figure 1.1.
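The ELF value of the base constellation follows directly from Equation (1.5). A short sketch of the calculation, with the helper name `elf` chosen here for illustration:

```python
def elf(shares):
    """Ethno-linguistic fractionalization: one minus the sum of squared group shares."""
    total = sum(shares)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("group shares should sum to one")
    return 1.0 - sum(p * p for p in shares)

base_elf = elf([0.20, 0.30, 0.50])
print(round(base_elf, 2))  # 0.62
```

A completely homogeneous country, i.e., a single group with a share of one, yields a value of zero.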

In every round, the individuals of each group make the decision whether or not to learn 
a new language as outlined in Table 1.2. At the start of every new round, the ELF value 
of the new constellation is calculated. 
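The round-by-round procedure can be sketched in a few lines. This is a simplified illustration and not the exact specification behind Figure 1.2: it assumes that a minority member switches once the revenue split $p_m - p_g$ exceeds the learning cost $\log_\theta(a_i)$, i.e., once $a_i > \theta^{p_m - p_g}$, that nobody switches back, and that abilities follow a Beta(2,3) distribution, whose distribution function has the closed form $6x^2 - 8x^3 + 3x^4$. All function names are hypothetical.

```python
def beta23_cdf(x: float) -> float:
    """Closed-form distribution function of Beta(2,3) on [0, 1]."""
    x = min(max(x, 0.0), 1.0)
    return 6 * x**2 - 8 * x**3 + 3 * x**4

def elf(shares):
    return 1.0 - sum(p * p for p in shares)

def simulate(initial, theta=0.2, rounds=15):
    """Return the list of share vectors: the start plus one entry per round."""
    majority = max(range(len(initial)), key=lambda g: initial[g])
    shares = list(initial)
    history = [shares]
    for _ in range(rounds):
        new = []
        for g, p0 in enumerate(initial):
            if g == majority:
                new.append(0.0)  # placeholder, filled in below
                continue
            split = shares[majority] - shares[g]
            # Assumed rule: switch if split > log_theta(a), i.e. a > theta**split.
            cutoff = theta ** split
            # Members of the original group with ability below the cutoff stay.
            new.append(p0 * beta23_cdf(cutoff))
        new[majority] = 1.0 - sum(new)  # everyone who left joined the majority
        shares = new
        history.append(shares)
    return history

history = simulate([0.20, 0.30, 0.50])
final_shares = history[-1]
elf_path = [elf(s) for s in history]
```

Under these assumptions the majority share rises in every round and the ELF path never increases, because each switch moves population from a smaller group to the largest one.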
The relative group sizes and the ELF value are shown for 15 rounds (or generations) in
Figure 1.2. Both smaller groups assimilate more and more into the majority group, raising
the majority’s share as a result. The revenue split ($p_m - p_g$) is constantly increasing. Thus,
it becomes more and more reasonable for lower ability levels $a_i$ to cover the individually
increasing costs of learning. After 15 rounds, the group split levels out at $p_A = 0.08$,


53 To be precise, it must hold that $a_i \in\, ]0, 1]$ and $\theta_{gh} \in\, ]0, 1[$ for the cost function to be defined. As
the definition boundaries would lead to either no costs at all or infinite costs regardless of the ability levels,
excluding these two values does not constrain the model.

54 Mathematical details and some graphical examples of the cost function are given in Appendix A.1 and
A.2.

[Figure: country ELF and relative group sizes $p_A$, $p_B$ and $p_C$ over 15 rounds played (generations)]

Figure 1.2: Results of dynamic modeling per round


$p_B = 0.13$ and $p_C = 0.80$, as for some individuals costs always exceed the revenues of an
additional language.^55 This leads to a significantly lower ELF value of 0.34. On the path
to the equilibrium, the smaller group A shows a faster assimilation process than the
medium-sized group B, due to its higher revenue differential ($p_C - p_A$).


1.4.2 Implications of group constellations and cost assumptions 


The previous section made it clear that the respective group constellation is relevant to 
the achieved equilibrium. From the construction of the ELF, it is obvious that different 
group constellations can lead to the same ELF value.^56 Figure 1.3 shows the equilibrium
path for three different group constellations with equal ELF levels at the beginning.


[Figure: ELF value over 15 rounds played (generations) for the initial group constellations (0.15, 0.18, 0.67), (0.07, 0.29, 0.64) and (0.43, 0.01, 0.56)]

Figure 1.3: ELF values for different group constellations


55 The relative group sizes do not add up to 1 due to rounding.
56 This is also one of the main criticisms of the ELF, as its value does not entirely reflect the distribution
of the groups. For more details, see Appendix A.1.2.

All trajectories start with an ELF of 0.5. In the first case, with two smaller groups of
roughly the same size and one majority group ($p_A = 0.15$, $p_B = 0.18$, $p_C = 0.67$), individuals from both
smaller groups have a nearly equal incentive to assimilate. After a fast assimilation process,
an ELF of 0.16 is approached. In the second case ($p_A = 0.07$, $p_B = 0.29$, $p_C = 0.64$),
most of the individuals from the smallest group assimilate. The medium-sized group is
already quite large, so fewer individuals decide in favor of assimilating into the majority
group. Only after an increase of $p_C$, due to the assimilation of members of group A, do more individuals
of group B assimilate. This leads to a slower assimilation process. The ELF value reached
after 15 rounds is 0.19. In the final case, where a high polarization of two big groups is
found ($p_A = 0.43$, $p_B = 0.01$, $p_C = 0.56$), there is not much movement between the groups,
leading only to a marginally decreased ELF of 0.49.
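That the three constellations share the same starting value can be verified with Equation (1.5). A brief check, with descriptive labels added here for readability:

```python
def elf(shares):
    """One minus the sum of squared group shares, as in Equation (1.5)."""
    return 1.0 - sum(p * p for p in shares)

constellations = {
    "two small groups and a majority": (0.15, 0.18, 0.67),
    "small, medium and majority group": (0.07, 0.29, 0.64),
    "two polarized groups": (0.43, 0.01, 0.56),
}

# Rounded to two decimals, all three constellations start at the same ELF.
starting_elf = {name: round(elf(p), 2) for name, p in constellations.items()}
```

The index therefore cannot distinguish between these group setups at the outset, although their trajectories differ considerably.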
Turning back to the initial group constellation with $p_A = 0.20$, $p_B = 0.30$ and $p_C = 0.50$,
the influence of a changing $\theta$ is shown in Figure 1.4.


[Figure: ELF value after 15 rounds plotted against $\theta$ from 0 to 1]

Figure 1.4: ELF values after 15 rounds depending on $\theta$ values


Obviously, with $\theta$ tending to zero there are virtually no costs at all in learning another
language, and all individuals, irrespective of their ability level, would assimilate into the
majority. The ELF value thus becomes zero, indicating a completely homogeneous country.
With a rising $\theta$, the costs increase rather quickly.^57 For higher values of $\theta$, the final ELF
value tends to the start value, in the example an ELF value of 0.62. In this
case, it is not feasible for any individual to change groups, as the costs are always too
high.
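The two limiting cases can be illustrated for a single round. The sketch below assumes, as a simplification, that a minority member switches when the revenue split exceeds the cost $\log_\theta(a_i)$, so that the share of a group willing to switch is the probability mass of the Beta(2,3) distribution above the cutoff $\theta^{\text{split}}$. The split of 0.30 corresponds to $p_C - p_A$ in the base example; the grid of $\theta$ values is arbitrary.

```python
def beta23_survival(x: float) -> float:
    """P(a > x) for a Beta(2,3) variable, written as (1 - x)^3 * (1 + 3x)."""
    x = min(max(x, 0.0), 1.0)
    return (1.0 - x) ** 3 * (1.0 + 3.0 * x)

def share_switching(theta: float, split: float) -> float:
    """Share of a minority group whose ability exceeds the cutoff theta**split."""
    return beta23_survival(theta ** split)

split = 0.30  # p_C - p_A at the start of the base example
thetas = (0.001, 0.05, 0.2, 0.5, 0.9, 0.99)
sweep = [share_switching(theta, split) for theta in thetas]
```

For very small $\theta$ almost the whole group switches already in the first round, whereas for $\theta$ close to one practically nobody does.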


1.4.3 Implications of economic growth and immigration


So far, the role of a growing economy, as captured by $s(y)$, has been neglected. In the
following, constant economic growth is assumed, i.e., a continuous increase in overall
economic activity. This scenario is depicted in Figure 1.5.


57 For the mathematical details of $\partial b/\partial \theta$, see Appendix A.1.1.

Figure 1.5: ELF values for different growth rates


Growth clearly increases the speed and scale of the assimilation process. In the employed 
example, the first case is the same as that above without growth. The equilibrium ELF 
value approaches 0.34 after 15 rounds. For the growth cases, the final ELF value signifi- 
cantly drops to 0.11 and 0.03, respectively. Combined with the previously found results, 
one can see that growth can make the homogenization process possible although there 


might be rather high costs, e.g., due to a low level of education or poor infrastructure. 
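The role of growth can be made explicit by solving the assimilation condition for the ability level. The derivation below is a sketch that combines the simplified condition $s(y)\,(p_m - p_g) - b_g(\theta, a_i) > 0$ with the logarithmic cost function of Equation (1.6); the cutoff symbol $a^{*}$ is introduced here for illustration and is not part of the original notation.

```latex
\begin{align*}
s(y)\,(p_m - p_g) - \log_{\theta}(a_i) &> 0 \\
\Leftrightarrow\quad \frac{\ln a_i}{\ln \theta} &< s(y)\,(p_m - p_g) \\
\Leftrightarrow\quad \ln a_i &> s(y)\,(p_m - p_g)\,\ln\theta
  && \text{(since } \ln\theta < 0 \text{ for } 0 < \theta < 1\text{)} \\
\Leftrightarrow\quad a_i &> a^{*} \equiv \theta^{\,s(y)\,(p_m - p_g)} \\[6pt]
\frac{\partial a^{*}}{\partial s(y)} &= a^{*}\,(p_m - p_g)\,\ln\theta < 0
  && \text{for } p_m > p_g
\end{align*}
```

A higher $s(y)$ thus lowers the ability level needed for assimilation to pay off, which is why growth can offset rather high learning costs.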


Finally, one would expect immigration to be a major driver of a country's diversity. A
very rough extension of the model above captures these dynamics. To account for this, a
fourth group, the immigrant group, is introduced. Its relative population share at
the beginning is zero. After that, the country experiences a steady immigration inflow of
2% into this group.^58 As a consequence, even before the assimilation
processes are taken into account, one will find changes in the group setup: by introducing the immigrant
group, the other relative group sizes decrease. All other dynamics discussed above remain
comparable.^59 As immigrants incur higher costs in learning the language of the host
country, it is assumed that immigrants have a higher cost function, with $\theta = 0.40$. For all


58 The immigration rate is a net rate, i.e., it is defined as the immigration rate exceeding the growth rate of the
country’s resident population.
59 There is one additional assumption to ensure that the dynamics remain constant. The reduction of
the resident group proportion due to an increasing share of immigrants can, in some constellations, lead to
a decreasing revenue split ($p_m - p_g$) for the smaller resident groups. This means that for some individuals who
already assimilated, the decision would no longer be viable in retrospect. In this case, no re-assimilation
into one’s old group is assumed.

other groups, the previous assumption of $\theta = 0.20$ still applies.^60 The differences from the
base example of Figure 1.2 are shown in Figure 1.6.

Figure 1.6: Overall country ELF values for different immigration rates


Besides the case without migration, two additional cases include a 2% and a 4% immigra- 
tion inflow. In the 2% case, the ELF values still decrease, showing a constant homoge- 
nization of the country, but the scale is significantly lower, and the ELF value eventually 
tends to 0.40. With a high immigration rate, however, the fragmentation is increasing,
resulting in a more heterogeneous country with an ELF value of 0.67.^61
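The purely mechanical part of the immigration effect, i.e., the dilution of resident group shares before any assimilation takes place, can be sketched separately. The example assumes that the net inflow per round equals the immigration rate times the current total population; this timing convention and the function names are assumptions made here, not taken from the model description.

```python
def elf(shares):
    return 1.0 - sum(p * p for p in shares)

def dilute(resident_shares, inflow_rate, rounds):
    """Relative shares after `rounds` of net immigration, ignoring assimilation."""
    residents = list(resident_shares)  # absolute sizes; initial population is 1
    immigrants = 0.0
    for _ in range(rounds):
        total = sum(residents) + immigrants
        immigrants += inflow_rate * total  # assumed: inflow relative to current total
    total = sum(residents) + immigrants
    return [p / total for p in residents] + [immigrants / total]

initial = [0.20, 0.30, 0.50]
shares_2pct = dilute(initial, 0.02, 15)
elf_diluted = elf(shares_2pct)
```

Taken on its own, the inflow raises fragmentation, since all resident shares shrink while a fourth group emerges. Whether the overall ELF falls or rises then depends on how strongly assimilation counteracts this dilution.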

The dynamic part of the model shows, in a simple manner, how the group constellation, 
the costs of assimilation, economic growth, and immigration affect a country's language 
heterogeneity. The important result to note is that there is no uniform dynamic
leading to more homogeneous countries; the initial heterogeneity might well persist,
or even increase due to immigration.


1.5 Extensions with international trade 


In a globalized world, considering a country in autarky will not tell the whole story. In this
section, the additional dynamics arising from an extension of the model to international
trade are outlined, and a second country is included. This extension only roughly
sketches the additional implications and does not take all the dynamics of the previous


60 Higher social and cultural cohesion within the country's resident groups makes it more costly for
immigrants to assimilate. See, for example, Carvalho (2010). For empirical evidence of quite lethargic
processes of assimilation by immigrants into their host country's culture, see Fernandez and Fogli (2005) and
Fernandez (2010). Additionally, if the immigrant groups are easily excludable and the host country groups
develop a racialist strategy, the costs of integration will further increase. These points are theoretically
discussed in the models of Caselli and Coleman (2008) and Darity et al. (2006). Conversely, a rising
share of immigrants may reduce the costs of migration, making it feasible for more people to take
this step and reducing the marginal ability level needed. Beine et al. (2011) confirm this empirically, showing
that the average educational level is significantly negatively correlated with the size of existing diasporas.

61 Besides immigration, differences in fertility rates across resident groups may equally lead to more
heterogeneous countries.

section into account. However, the main implications are easily visible. The two countries
A and B are of equal size, and within the countries the pay-off incentives are similar. The
incentive constraints now only carry a country indicator. In line with Equation (1.4), they
are now given by:

$$s(y^A) \cdot (p_m^A - p_g^A) - b_g^A(\theta, a_i) > 0$$
$$s(y^B) \cdot (p_m^B - p_g^B) - b_g^B(\theta, a_i) > 0$$


Now, all individuals can also engage in trade with the other country. Again, to facilitate
trade, the individual would need to learn the language of the group he wants to trade with.
Learning the language of another country bears the cost of $c_g^A(\theta, a_i)$, and it is assumed
that $c_g^A \geq b_g^A$ holds pointwise and thus:^62

$$c_g^A(\theta, a_i) = \varphi \cdot b_g^A(\theta, a_i), \quad \text{with } \varphi \geq 1 \tag{1.7}$$


This is indeed plausible, as a foreign language is likely to be more distant from one's own
than another language within the same country. Additional costs might arise because
learning possibilities are more limited due to geographical distance. In order to trade with
another country, trading costs of $\tau$, with $0 \leq \tau \leq 1$, accrue. For an individual of country
A to engage in trade with country B, the following condition needs to hold:

$$\left[(1-\tau) \cdot s(y^B) \cdot p_h^B + s(y^A) \cdot p_g^A\right] - c_g^A(\theta, a_i) > 2 \cdot s(y^A) \cdot p_g^A$$
$$\Leftrightarrow \left[(1-\tau) \cdot s(y^B) \cdot p_h^B - s(y^A) \cdot p_g^A\right] - c_g^A(\theta, a_i) > 0 \tag{1.8}$$


Comparing now (1.4) and (1.8), one can see that if the countries’ structures are comparable,
international trade is not pursued. If both countries have a comparable development
structure ($s(y^A) \approx s(y^B)$) and the majority groups are the same size ($p_m^A \approx p_m^B$), it is
rather unlikely to see international trade emerge, as higher learning costs ($c_g^A \geq b_g^A$) and trade costs
$\tau$ apply. If the structures are different, international trade might emerge.

Individuals of a certain group $g$ in country A have three possibilities: remain in autarky,
assimilate into the majority group of their own country, or engage in international trade
with the other country. The sum of pay-offs over both stages in autarky is simply given by:

$$2 \cdot s(y^A) \cdot p_g^A \tag{1.9}$$


The two other options involve some learning costs but also the possibility of higher revenue. 


For these choices, the following constraints need to be satisfied: 


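62 For $c_g^A(\theta, a_i)$, $\partial c_g^A(\theta, a_i)/\partial a_i < 0$ and $\partial c_g^A(\theta, a_i)/\partial \theta > 0$ also apply.

Assimilation (national trade):

$$s(y^A) \cdot (p_m^A + p_g^A) - b_g^A(\theta, a_i) > 2 \cdot s(y^A) \cdot p_g^A$$
$$\Leftrightarrow s(y^A) \cdot (p_m^A - p_g^A) - b_g^A(\theta, a_i) > 0 \tag{1.10}$$

International trade:

$$\left[(1-\tau) \cdot s(y^B) \cdot p_h^B - s(y^A) \cdot p_g^A\right] - c_g^A(\theta, a_i) > 0 \tag{1.11}$$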


For an alternative to autarky to be chosen by any individual of a minority group in country A,
Equation (1.10) or (1.11) needs to hold.
In addition, for a decision in favor of international trade, Equation (1.11) needs to
deliver a higher pay-off than (1.10) and thus:

$$\left[(1-\tau) \cdot s(y^B) \cdot p_h^B - s(y^A) \cdot p_m^A\right] - \left[(\varphi - 1) \cdot b_g^A(\theta, a_i)\right] > 0 \tag{1.12}$$


For high costs, the autarky option delivers the highest expected revenue. Furthermore, 
the dynamics of the previous sections still apply. Lowering the costs $b_g^A(\theta, a_i)$ or increasing
$s(y)$ makes assimilation more probable as a result. Decreasing transportation costs $\tau$ (or
closer integration into international trade) increases the value of the international trade
option and would thus also make it more probable. With further increased education 
and better integration into the global economy, the international trade option offers the 
highest pay-off. Having more opportunities to engage in international trade, a single group 
would have a greater possibility of finding an international trading pair that minimizes 
its assimilation costs. With more developed and integrated international trade, there are 
many incentives not to assimilate into the majority group of their own country, but to 


pursue international trade. 
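The choice between the three options can be illustrated by comparing the pay-offs underlying Equations (1.9) to (1.11). The following sketch assumes the logarithmic cost function of Equation (1.6) and a constant cost multiplier for the foreign language as in Equation (1.7); the function names and all parameter values are hypothetical and serve only to show the mechanics.

```python
import math

def learning_cost(theta, ability):
    """Cost of learning a domestic language, b = log_theta(ability)."""
    return math.log(ability) / math.log(theta)

def best_option(p_g, p_m, p_foreign, ability, s_home=1.0, s_foreign=1.0,
                theta=0.2, tau=0.2, phi=1.5):
    """Return the option with the highest two-stage pay-off and all pay-offs."""
    b = learning_cost(theta, ability)
    payoffs = {
        # Autarky: trade within the own group in both stages, cf. (1.9).
        "autarky": 2 * s_home * p_g,
        # Assimilation: own group plus domestic majority, minus learning costs.
        "assimilation": s_home * (p_m + p_g) - b,
        # International trade: foreign group net of trade costs, own group,
        # minus the higher costs phi * b of learning the foreign language.
        "international trade": (1 - tau) * s_foreign * p_foreign
                               + s_home * p_g - phi * b,
    }
    return max(payoffs, key=payoffs.get), payoffs

able, _ = best_option(0.2, 0.5, 0.5, ability=0.9)
unable, _ = best_option(0.2, 0.5, 0.5, ability=0.05)
global_trade, _ = best_option(0.2, 0.5, 0.9, ability=0.9,
                              s_foreign=2.0, tau=0.05, phi=1.0)
```

With comparable country structures, the international option never beats assimilation in this sketch, as both the higher learning costs and the trade costs work against it. It becomes attractive only once the foreign market is sufficiently larger or more developed and trade costs are low.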


This dynamic might even increase if the assumption of two countries of equal size is 
abandoned, and the second country is interpreted as the access to global trade. If this 
were the case, relatively smaller groups would still be big enough to be interesting for a 
small group in country A from an absolute size perspective. In the single country case, the 
assimilation decision was only relevant for all $p_g < p_m$. However, the international trade
decision might even be relevant for a majority group, as long as $p_h^B$ is sufficiently larger
than $p_m^A$. This is certainly the case in the global trade interpretation. Introducing the
second country thus clearly opens up the possibility for heterogenization dynamics that
were not possible in the single country case.

In terms of regional trade integration, this is even more relevant. One often finds the
same ethnic or language group in two neighboring countries. In the case that this group
is a majority in the one country but a minority in the other, the costs of assimilation
tend to zero and only the transportation costs $\tau$ apply. In such a setting, the minority
would not assimilate into the majority group of its country but would pursue further trade
integration with the brother group in the neighboring country.


1.6 Conclusion 


Following the basic outline of Lazear (1999), his trade model is extended to better un- 
derstand the dynamics of changes in a country's heterogeneity. The basic model shows 
that depending on the group split, the assimilation costs, and economic growth, com- 
plete assimilation is not a necessary equilibrium and a certain level of heterogeneity could 
well exist. The highest incentive lies with the minority group(s) to assimilate into the 
majority group. The larger the difference between the minority and majority group, the 
faster assimilation will take place. A country that is fragmented into rather equally sized 
groups, however, will not experience major changes in its group setup, retaining its level 
of heterogeneity. 

In analyzing the changes in dynamics over some generations, the core dynamics of the 
basic model are confirmed. However, a close link to the changes in ELF values is also 
evident. Due to these analyses, one can see that countries with different group setups, 
but with the same level of ELF, might experience quite different changes, leading to 
different equilibrium ELF values. This clearly questions the general applicability of the 
ELF index. Although it measures fragmentation to some extent, it might not be the 
adequate measure of ethnicity in all analytical setups. With high costs of learning another
language, for example due to a generally lower level of education in a country, or because
of the fundamental differences between two languages, the heterogeneity of a country is
retained. Thus, raising a country's level of education and reducing the assimilation costs
for all ability levels should go hand in hand with a steady process of homogenization. This
process is strengthened by higher economic growth. Thus, over its economic development
path, one should find many more homogeneous countries. Besides the main drivers already
looked at, migration can, as expected, retard the process of homogenization and, above


63 This additional option is probably the reason for the persistence of a high heterogeneity in Sub-Saharan
Africa. In the nation building process, as part of the colonization of Africa, the colonial powers seldom drew
borders along group lines, often bisecting the territory of ethnic groups with a border between two new
countries. See Alesina et al. (2011) for an assessment of ‘artificial’ states and a new measure of how borders
split ethnic territories between neighboring countries. Michalopoulos and Papaioannou (2011) analyze the
effects of dividing these ethnic groups on their contemporary economic performance. For African case
studies on how different relative group sizes affect the political salience of conflict between these groups,
see Posner (2004b), and for resulting differences in public goods provision, see Miguel (2004).

a certain threshold, even lead to more heterogeneous countries. The important result to
note is that there is no uniform dynamic toward more homogeneous countries, but that the
initial level of heterogeneity might well persist, or even increase due to immigration.

A rough extension covering two countries, i.e., providing for the possibility of international
trade, gives some insights into how the dynamics of ethnic group
splits might be altered. In this case, one might expect more languages and thus a more
heterogeneous country. This is especially clear in the case where ethnic groups were split
up, having to live in different countries and represent different shares of their respective
country’s populations. The incentive of assimilation for the minority group fades here, as
it can pursue trade with its neighboring relatives, with higher trading costs but without
the high costs of assimilation.

The extended model outlined in this essay gives a better understanding of ethnicity’s 
dynamics and its endogenous formation. Furthermore, the results call for an empirical 
verification and for more profound analyses of specific case studies to better understand 
the dynamics of ethnicity, as it can hardly be assumed to be static anymore. This is 
especially important for the growing strand of empirical literature on the role of ethnicity. 
Using the ELF index for these analyses, one should be cautious in the interpretation of 


its role and discuss the results in the light of the dynamics found in this essay. 


Philipp Kolo - 978-3-653-02395-4 
Downloaded from PubFactory at 01/11/2019 11:11:06AM 
via free access 




Chapter 2 


Drivers of Ethnic Fragmentation 


2.1 Introduction 


“Every valley is still a little world that differs from neighboring world as Mercury does 
from Uranus” (Weber, 1976, p. 47). In this quote, Weber is not referring to a developing 
country in the heart of the African continent where ethnic heterogeneity is claimed to be 
at the roots of its growth problems.^64 Instead, it is a quotation from an economist describing
France in the second half of the 19th century. Only 36 out of 89 départements were fully
French-speaking, and “French was a foreign language for a substantial number of Frenchmen,
including almost half the children who would reach adulthood in the last quarter
of the century” (Weber, 1976, p. 67). In addition to the language’s heterogeneity, Weber
describes in great detail how diversity was persistent in every part of life, from cultural
traits, measurement systems and currencies to various beliefs which were in contrast to the
officially proclaimed Christianity. Several decades later, in the middle of the 20th century,
demographic estimates already showed the more common picture of France being
the homogeneous ‘grande nation’.^65

This paves the way to investigate the dynamics of a country’s ethnic heterogeneity and
to question the static nature on which most of the economic literature bases its analyses
of the role of ethnic heterogeneity.^66 Although most authors admit that there is some
endogeneity involved, they do not pursue this fact further and proclaim that fragmentation,
at least, is not changing over a short period of time.^67 However, in a time where conflicts,


64 See the influential paper of Easterly and Levine (1997) about ‘Africa’s growth tragedy’.

65 Héran et al. (2002) assess that less than 10% of parents did not speak French with their children in
1950.

66 For a more detailed overview of the ways in which ethnic fragmentation is affecting the economic
outcome of a country, via its influence on institutional and policy drivers of growth, see, for example, Alesina
and La Ferrara (2005). For a good overview of ethnic fragmentation’s influence on conflict incidence, type
and duration, see additionally Garcia-Montalvo and Reynal-Querol (2003).

67 A rare exception is Fedderke et al. (2008), with a case study on South Africa. They employ changing
values of racial fragmentation for each decade in their analysis on its role in economic growth. Evers
et al. (2010) offer a rough overview of a newly developed index for Malaysia, intended to better analyze
Malaysia’s ethnically heterogeneous society and to track its changes.

migration and globalized trade are shaping countries and their populations, shouldn’t one
be able to observe shifts in a country’s ethnic setup over several decades?


In contrast to this literature, some recent publications try to shed some more light 
on what roots ethnic heterogeneity might have and why it developed so differently across 
the globe. In contrast to the previous essay, where the dynamics of ethnic fragmentation 
were modeled theoretically, the focus of this essay is to empirically test what drives these 
dynamics. Ahlerup and Olsson (2007) analyze the influence of human settlement, finding 
that the duration of uninterrupted settlement leaves more time to diverge into different 
groups, leading to an increased fragmentation. The existence of modern states and their
institutions lowered a country’s fractionalization.^68 Additionally, policies might directly
or indirectly promote ‘assimilation’.

Michalopoulos (2011) bases his article on Darwin’s theory of evolution. He argues that 
various geographical conditions are “the ultimate cause of the emergence and persistence 
of ethnic diversity” (Michalopoulos, 2011, p. 2). These different settings in turn lead to 
the emergence of different species, adapted to their specific niche, which is also true of the 
modern human being. 

Whereas both Michalopoulos (2011) and Ahlerup and Olsson (2007) explore rather 
long-term historical and geographical determinants of ethnic fragmentation, Campos and 
Kuzeyev (2007) analyze changes in heterogeneity in the former Soviet republics after the 
fall of the Iron Curtain. Their approach thus comes closest to the intention of this essay. 
They show that over the decade that followed 1989, ethnic fractionalization decreased
in most countries, language heterogeneity did not change significantly, and religious
heterogeneity demonstrated a slight increase.^69 Unfortunately, Campos and Kuzeyev (2007)
conclude with these findings and stop short of empirically analyzing the reasons for the
adaptations.^70

Using data for the 1960s and 1980s, this essay supports the above findings that a ‘base-level’
of ethnic fragmentation evolved due to a set of geographical and historical variables.
Furthermore, it offers a new interpretation of colonization’s impact on shaping a country’s 
ethnic fragmentation. The approach the colonial powers followed in their pursuit plays 
an important role. The main finding of this essay is that ethnic heterogeneity is changing 
over a rather short period of twenty years. Migration is the most obvious factor in a more 


integrated and globalized world, which is confirmed by this study. However, it also shows 


68 See also Ranis (2011), who argues that kinship relationships are a mere compensation for non-existent
official social security networks.

69 For a discussion of the interrelations between various forms of fractionalization and various social,
political and institutional dimensions in a case study of South Africa, see Fedderke and Luiz (2007).

70 In an independent, but simultaneous contribution, Green (2011) follows an approach comparable to
the one followed here. He uses the same data set, but follows a different empirical strategy and does not
differentiate between drivers for a ‘base-level’ of fragmentation and more short-term influential factors. The
main driver for a country’s homogenization in Green (2011), urbanization, is found to be equally influential
here. However, the results of this essay show that it is not only one driver, but additional important ones
that have an influence, which are either not identified or deemed less influential by Green (2011).

that urbanization and education in particular play a significant, and even more important,
role.

The remainder of this chapter is organized as follows. In section 2.2, ethnicity is 
briefly framed and the key views on its dynamics are introduced. Section 2.3 structures 
and discusses the various drivers that might be responsible for changes in a country's 
ethnic setup. Section 2.4 outlines the empirical strategy and discusses the data sources 
used, their coverage, limitations and first insights into the descriptive statistics. Section 
2.5 then discusses the empirical significance of the drivers for a wide range of countries. 
Finally, section 2.6 summarizes the key findings, concludes and gives an outlook for further
research.


2.2 Framing ethnicity 


Ethnicity is sometimes described as a “rather vague and amorphous concept” (Alesina
et al., 2003, p. 160). Despite the lack of a clear-cut definition, the economic literature focuses
mainly on three characteristics when discussing a country's cultural fragmentation: its
ethnic, language and religious fragmentation. The combination of ethnic and language
characteristics leads to the widely used taxonomy of ethno-linguistic groups.71 Based on
relative group sizes defined along these characteristics, Taylor and Hudson (1972) built the
ethno-linguistic fractionalization index (ELF) as a measure of a country's ethnic setup.
Although other forms of operationalization have recently emerged to address
specific questions, the ELF is still the most commonly used measure whenever ethnicity is
included in economic analyses. The ELF is calculated based on a Herfindahl-Hirschman
concentration index:
\[
ELF = 1 - \sum_{i=1}^{K} p_i^2, \qquad i = 1, \dots, K \tag{2.1}
\]

where $p_i$ are the relative group sizes of the $K$ groups in a given country. The measure
ranges between zero (only one group and thus complete homogeneity) and one (complete 
heterogeneity). It reflects the probability that two randomly selected individuals from a 
population come from different groups and generally increases with the number of groups. 
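
As a minimal sketch of equation (2.1), the following Python function computes the index from a list of group sizes. The function name `elf_index`, its interface and the example values are illustrative assumptions; they are not taken from the Atlas Narodov Mira data or from any of the cited studies.

```python
def elf_index(group_sizes):
    """Return 1 - sum(p_i^2) for the K groups of one country.

    `group_sizes` may hold shares or raw head counts; they are rescaled
    to sum to one. Name and interface are illustrative assumptions.
    """
    total = sum(group_sizes)
    if total <= 0 or any(size < 0 for size in group_sizes):
        raise ValueError("group sizes must be non-negative and not all zero")
    shares = [size / total for size in group_sizes]
    return 1.0 - sum(p * p for p in shares)


# Hypothetical countries, not real data.
homogeneous = elf_index([1.0])             # a single group
three_groups = elf_index([0.5, 0.3, 0.2])  # shares given directly
four_equal = elf_index([25, 25, 25, 25])   # head counts are rescaled
```

With K equal-sized groups the index equals 1 - 1/K, so the upper bound of one is only approached as the number of groups grows.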
However, defining ethnic heterogeneity and its measurement does not yet explain
what led to the emergence of ethnic groups and what shaped, or constantly shapes, ethnic
identities and their group identification. Three schools of thought have emerged to provide


71 See, for example, Alesina et al. (2003) and Fearon (2003), who build their measures based on this
combined taxonomy. For more details on language groups and their mutual differences, see Lewis (2009)
and Fearon (2003). For some specific analysis on the role of religion, see, for example, Guiso et al. (2009)
or Barro and McCleary (2003), and Garcia-Montalvo and Reynal-Querol (2003) for the role of religious
polarization.

72 The seminal articles of Mauro (1995), Easterly and Levine (1997), Collier (1998, 2001), Alesina et al.
(2003) and Alesina and La Ferrara (2005) all rely on the ELF index. For details on other measures, see
Garcia-Montalvo and Reynal-Querol (2003, 2005a, 2008) for an index of polarization, Posner (2004a) on
his restricted index of politically relevant ethnic groups, and Fearon (2003) for the idea of ethnic distance
that is further explored in Chapter 3.

an answer to these questions. The primordial, the instrumentalist and the constructivist
approaches differ in their interpretation of the origin, persistence and shaping forces of ethnic
groups. Fenton (2010) combines the three concepts well, offering a
coherent case for the main argument of this essay, i.e., that ethnic boundaries
are indeed subject to change. In his view ethnicities are “grounded as well as constructed. 
Ethnic identities take shape around real, shared material experience, shared social space, 
commonalties of socialization and communities of language and culture. Simultaneously, 
these identities have a public presence; they are socially defined in a series of presentations 
(...) by ethnic group members and non-members alike” (Fenton, 2010, p. 201). Ethnicity, 
as such, therefore contains some irrevocable core characteristics that represent the most 
essential characteristics of a group, whereas other parts of the ethnic identity might be 
subject to change. Thus, identification of ethnic groups can either be driven by the self-identification
of their members around a common marker, or at least partly driven and
shaped by the political or societal arena in which their identities are activated, and in 
which the ethnic group identification is formed. In analyzing drivers of changes in a 
country’s ethnic setup, all three approaches subsequently deliver potential explanations 
and influential factors. 

Some theoretical frameworks and mathematical models offer additional motivation for 
the dynamics of changing ethnic boundaries. Constant and Zimmermann (2007) discuss, 
in a simple framework, the main strategies of immigrants with respect to their ethnic 
heritage, following either an assimilation, integration, marginalization or separation strat- 
egy. Depending on the strategy chosen, different effects on the ethnic composition in 
the destination country would emerge. Bodenhorn and Ruebeck (2003) model and analyze
the emergence of mixed ethnic groups in the United States in order to improve their
economic position. Darity et al. (2006) use an evolutionary game theory model to show
different ‘acculturation’ outcomes, and Caselli and Coleman (2008) analyze the decision to
change group membership within a model of ethnic conflict. Ahlerup and Olsson (2007)
build their model on kinship-based social organizations providing public goods.73 Finally,
Lazear (1999) models the assimilation processes of language groups in order to sustain 
or ameliorate trade. Chapter 1 extended this approach and covered the main dynamics 
this essay tries to prove empirically. It balances the gains of increased trade possibilities 
from learning a new language with the costs of doing so. The costs are strongly influenced 
by the proximity of the two languages and the infrastructure available, both for learning 
as well as trading. Trading gains in turn are defined by the size of trade partners, i.e., 
the size of the respective language groups. The extended model shows that with rising 
development, a continuous process of assimilation into the majority group is expected. 
Increasing education lowers the costs of learning and more individuals would decide in 
favor of assimilation. Higher transportation costs (or less integration or infrastructure)
decrease the value of the trade option and would thus make assimilation less probable.

73 The aforementioned models were already described in more detail in Chapter 1.

Migration can impede the homogenizing path of a developing country and might even
increase a country’s heterogeneity in some cases. Thus, the model of Chapter 1 gives some
initial points of reference for the subsequent discussion of potential drivers for a changing
ethnic setup.


2.3 Potential drivers of ethnic fragmentation changes 


Ethnic boundaries that are based on tradition, ancestry and conveyed habits are certainly
not subject to instant changes. However, the environment in which a generation is
raised, be it economic, social or educational, should leave its mark, thus leading
to a changed ethnic identification, especially in an increasingly globalized world. A key
difference between the prospective drivers of change might be their time dimension. The
geographic properties of a country are, for all intents and purposes, fixed.74 Other factors
can change rather quickly and are susceptible to political influence. Depending on the ease
of change, the variables can be categorized into two groups: evolutionary and historical
factors, and socioeconomic and policy factors.


2.3.1 Evolutionary and historical factors 


Location and geographical conditions. One of the most basic location characteristics
of a country is its latitude. Michalopoulos (2011) points to the fact that biodiversity
decreases with an increasing distance from the equator. The high level of biodiversity
around the equatorial region is rooted in its tropical climate, the associated habitat
heterogeneity, and its higher pathogen load (Cashdan, 2001). The lack of climatic variability
in tropical areas leads to specialization in a very specific environment or niche. Areas
with high climatic variability (e.g., hot summer, cold winter) require a more generalized
approach to manage this variability, subsequently leading to a lower species variation.
Additionally, a country that has a large proportion of mountainous areas offers more niches
and at the same time makes an exchange between these areas much more difficult. For
both reasons, one would expect more mountainous countries to be more diverse. Large
countries that cover a huge area should encompass more bio-geographic niches and should
thus demonstrate greater heterogeneity.75


Human development. Uninterrupted human settlement over millennia has allowed more time
for humans to diverge into different groups. Ahlerup and
Olsson (2007) reconstruct the way in which modern humans migrated from their birthplace
in East Africa to all other parts of the world. In doing so, the development follows a

74 Access to remote areas can be eased, but this is a policy decision regarding infrastructure,
rather than a change in geographical conditions per se.

75 Ashraf and Galor (2007) model explicitly how geography affects cultural assimilation and cultural
diffusion. They conclude that these two modes of influence are responsible for different timing and speed
of industrialization, which affects the economic performance of nations today.

constant process of genetic fractionalization. Geographical conditions are an important
driver in the emergence of different human groups. Whereas the vast amount of time since 
the emergence of the modern human has already led to diversification solely based on 
genetic mutations, geographical conditions help to shape and maintain heterogeneity in 
various locations.76 Ahlerup and Olsson (2007) direct attention to Papua New Guinea as
an example of how both aspects jointly affect ethnic heterogeneity. Its special geography
spans a wide array of bio-geographic niches, and with humans known to have lived there
for some 65,000 years, this has led to many isolated and distinct ethnic and language
groups. Some 860 indigenous languages, spoken within a total population of only around
four million, are still reported today.77


State history and colonization. Institutions can play a decisive role in homogenizing
countries. Well-functioning institutions that include codified laws, security and military
protection have rendered ethnic and cultural forms of interaction less important and this
should have led to an assimilation process into the major group.78 For Olsson and Hibbs
(2005), the transformation from a hunter-gatherer economy to sedentary agricultural production
was one of the most important events in shaping societies.79 This transition led to
a very basic set of institutions. A subsequent increase in productivity promoted the development
of a non-producing class. Freeing this class from production obligations left room
for the development and organization of knowledge, leading to the expansion of science,
technology, and state formation. The time since the agricultural transition is therefore
assumed to be a factor influencing civilizations and their respective heterogeneity.80

In many developing countries, the arrival of colonizers had a lasting influence on existing
structures and was a significant factor in creating and shaping countries and societies.
Colonizers tried to introduce their legal and political systems, and often forced
their own language on occupied countries. From a language point of view, Latin America
displays a strong homogeneity as Spanish was widely adopted. The same is true of many
French-speaking countries in Africa. The identity of the colonizer and the time span of
colonization might be crucial factors for changes in ethnic boundaries. Depending on their
interests, colonial powers either pursued a ‘divide-and-rule’ approach and merely
exploited the country without any long-term interest (mainly in Africa), or actually established
institutions to sustain long-term development and settlement (e.g., Canada
or Australia). Acemoglu et al. (2001) attribute these two contrasting approaches to the


76 For an additional discussion of the similarity between biocultural heterogeneity and ethnic fragmentation,
see Loh and Harmon (2005) and Evers et al. (2010).

77 The 860 languages represent over one tenth of the world's total (Lewis, 2009).

78 See, for example, Greif (1993) on an example of ancient trade relationships in the Maghreb region.
For a broader overview, see Rauch (2001).

79 In their argumentation, Olsson and Hibbs (2005) follow Diamond (1997), who roots the Neolithic
Revolution in different biogeographical endowments, leading to differences in resource surpluses.

80 Ahlerup and Olsson (2007) explore how experiences with a modern state over the last hundred years
significantly reduced fragmentation. Yet they admit that causality in this aspect is not clear, and more
homogeneous countries might have developed a modern state more easily and thus earlier in history.

differences in living conditions the colonizer came across at that time. They measure
these conditions as the mortality rate amongst the Europeans arriving in their respective 
colonies. In countries with higher mortality rates, the colonizers did not want to create 
lasting structures and institutions intended for long-term settlements. A more extractive
approach specifically exploited differences between groups, deepened them and
turned the groups against each other. This was pursued by the Belgians in Rwanda with
the Hutu-Tutsi split, which was still salient in the twentieth century.81 In countries with
higher mortality rates, which were subsequently exploited and experienced lower levels of
institutional development, one might find a higher degree of ethnic fragmentation.


2.3.2 Socioeconomic and policy factors 


Demographic factors. The global international migrant stock rose from 72 million to
213 million people between 1960 and 2010 (World Bank, 2011). Immigration is seen as the
primary reason for increasing heterogeneity with respect to ancestry, ethnic
origins, and religions, bringing long-term changes to the population make-up (Coleman,
2009).

Schüler and Weisbrod (2010) analyze whether the effect of ethnic heterogeneity on
economic performance changes for countries with a higher stock of immigrants. They conclude
that migrants increase trade as they import information about their home country,
thereby reducing transaction costs and simultaneously increasing trade due to their preferences
for home country products. However, they do not analyze what impact immigration
has on the level of heterogeneity in a country.82

Fertility rates and population growth are affected by a wide range of factors. Ultimately,
not only a woman’s personal experience but also her heritage plays a decisive
role. Different preferences in fertility rates between a country’s historic population and
immigrant groups might be important. Most host countries (mainly developed countries)
experienced their fertility transition earlier than most less developed countries (where
many immigrants originate), significantly lowering the number of births per woman.
This should have a significant impact on destination countries.84

A rising population density will mainly affect very small countries. The growth of
metropolitan regions, by contrast, matters for a broader set of countries. The
population density in urban areas might even increase when the country density remains
constant due to high rural-urban migration flows. In her work on biodiversity, Cashdan
(2001) showed that an increased density of species leads to a higher degree of specialization


81 For a broader discussion of a ‘divide-and-rule’ strategy as a principle of mere exploitation, see Ahlerup
and Olsson (2007). For the Rwandan case, see also Caselli and Coleman (2008), who discuss their theoretical
model in light of this conflict.

82 In particular, their heterogeneity measure does not change even for high-immigration countries.

83 Fernandez and Fogli (2009) find that heritage-induced fertility is a significant and persistent factor
within second generation immigrant mothers in the United States.

84 Hispanic and Asian ‘minority’ groups in the United States are projected to account for around 36% of
the total population by 2050 (Coleman, 2009).

in a smaller area and thus eventually to a higher level of heterogeneity. Urban areas are
an agglomeration of people all struggling over limited resources. Thus, coordination along
ethnic ties to better sustain economic or social development can be expected. However,
as the newly arrived population needs to interact with the existing masses, integration
into the mainstream can also be expected. One can maintain the argument that urbanization
erodes cultural foundations and replaces ethnic ties with more interest-based liaisons
(Bates, 2006). This could have an effect on the ethnic differences between groups. Ethnic
borders become less pronounced, leading to a more homogeneous civilization. The impact
of urbanization is thus, a priori, not clear.


Conflicts. A wide literature tries to link an increased incidence of conflict with
higher ethnic fragmentation.85 The reverse causal chain has yet to be addressed in empirical
papers, but some theoretical models capture this dynamic.86 What remains unquestioned
is that the various forms of conflict have a significant impact on a country's
population. Presumably, death from persecution or combat has a direct impact on the
population, whereas refugee-induced migration affects it indirectly. This is true in
both the country where the conflict is rooted and neighboring countries. The violent construction
of ethnic identities, ethnic cleansing and genocides are the most brutal forms of
this. In line with the constructivist view, the question also arises whether ethnic
identities arise or are shaped upon the onset of ethnic conflicts. Elites might agitate their
peers and strategically use potentially salient ethnic divisions for their ambitions. Fearon
and Laitin (2000) analyze a wide range of case studies, concluding that elites systematically
construct ethnic identities in order to strengthen, maintain or seek power.87


Economic factors and trade. There is a growing literature on factors benefiting the
economic growth of a country, including various measures of institutions, financial indicators,
trade, education or infrastructure. Thus, it would seem natural to include GDP
figures in the regressions. However, it is hard to see why the level of economic development
per se should alter the ethnic heterogeneity of a country, if this is not
the case with various variables highly linked to it. Thus, to better elaborate upon which
of these variables affect heterogeneity, a set of variables highly linked to GDP per capita
measures is included.


85 The first to analyze the effect of ethnic fragmentation on conflicts were Collier and Hoeffler (1998).
Fearon and Laitin (1999) then analyzed the question with a focus on minority groups, Collier (1998) with
a focus on democratic institutions, and Fearon (2003) with a more general approach.

86 See, for example, Caselli and Coleman (2008) or Darity et al. (2006) and more generally Ahlerup and
Olsson (2007).

87 Fearon and Laitin (2000) also give a general overview of the theory on social construction of ethnic
identities.

88 An exemplary selection of papers analyzing economic growth factors that also deal with ethnic fragmentation
includes Easterly and Levine (1997), Mauro (1995), Alesina and La Ferrara (2005), Bellini et al.
(2009), Collier (2000) and Sachs (2001).

Olsson and Hibbs (2005) discuss the structural changes within an economy over its
development path. A different economic structure could be more susceptible to different 
values regarding ethnic heterogeneity. Gellner (1983) reasons that the industrial revolution 
and the accompanying higher division of stages in production led to a need for higher 
homogenization. To face the new division of labor and to efficiently work together, there 
was a need for a certain level of assimilation or homogeneity. 

Assimilation does not necessarily take place within one economy only, but can also 
have the effect of a mutual rapprochement between two different countries. For Janeba 
(2004), imported Western products are responsible for crowding out locally manufactured 
goods and might even marginalize local culture. In general, trade makes a higher variety 
of (foreign) products available and normally also reduces the price of these goods. The
increased access and lower relative price decrease the overall cost of non-conformity with
the individual’s own culture and pave the way for a more globalized, or generalized,
culture.89 In some constellations of his model, this might even outweigh the gains from trade.


Institutions and policy factors. Institutions in general and their underlying ideology
might play an important role. The development of state structures, codified law, governing
institutions and common military protection have changed the way we live together.
Ethnic identity might always be a point of tension in a nation state promoting cultural
similarity and integration. The relationship between ethnic fragmentation, the emergence
of institutions and vice versa is not a priori clear. Institutions can grant equality, human
rights and freedom to pursue cultural expressions. They can also be used as an excessive
form of nationalism, excluding culturally deviant citizens with various forms of pressure,
or even brutality.90 This kind of homogenizing policy can be present in all forms of state
activities, always with the intention of considerably altering the ethnic composition into a
more nationalistic, homogeneous country. In forming a French identity, as outlined in the
introduction, the modus operandi was rather peaceful. In the last century, however, some 
cases exhibited unimaginable brutality. 

Linked to institutions is the inevitable question of the role of democracy. Both Alesina 
and La Ferrara (2005), and Collier (1998) show that more democratic regimes moderate the 
potentially detrimental effect of ethnic fractionalization on economic development. This 
could indicate a more tolerant environment in democratic countries in which more diverse 
views are accepted. Campos and Kuzeyev (2007) hold the more tolerant environment of 
democratization after the fall of the Iron Curtain in the former Soviet republics responsible 
for an increased religious heterogeneity. However, this might have been a special case, as 
religious activity was especially disregarded under the communist regime. More autocratic
or dictatorial regimes that are built around a very nationalistic ideology might display


89 Dreher (2006), for example, proxies social globalization inter alia with the number of McDonald's
restaurants.

90 For a discussion of the blurred transition between ethnicity and nationalism, see Eriksen (1991).
Philipp Kolo - 978-3-653-02395-4 
Downloaded from PubFactory at 01/11/2019 11:11:06AM 
via free access 


42 Chapter 2. Drivers of Ethnic Fragmentation 


significantly lower heterogeneity. Again, the role of democratic regimes and the direction
of causality are not clear.91 However, there is some indication that this kind of political
regime at least leaves more room for cultural activity, which might be represented in a
more diverse religious or ethnic identification.


Education plays a key role in the development of a country (Barro, 1999; Knack and
Keefer, 1997) and its democratization (Akdede, 2010; Barro, 1999). Bolt and Bezemer
(2009) describe well the different effects education might have. In a general interpretation,
education increases one’s human capital. With higher human capital,
one’s social and economic vulnerability declines. Less vulnerable groups are less reliant
on ethnic differentiation or identification to pursue their (economic) activities. Education also
increases tolerance and leads to more rational decisions. Both effects back up the argument
that ethnic identification becomes less important with an increasing level of education.
Conveying a common history and culture can lead to a better mutual understanding
but can also be used as a means of exerting influence over young citizens. In the context
of this chapter, education is also interpreted as a strong expression of state power with
the “purpose of cultural repression” (Bolt and Bezemer, 2009, p. 28). For minorities, education
often includes language education, as they might have been raised in their native
language.92 It seems that early education has the most significant effects, as it is the first
time in many countries that a young citizen is confronted with the influence of state institutions.
The shift from no schooling to primary schooling is thus probably the most
important one. A country with a higher primary enrollment rate or educational coverage
might be more homogeneous as a result. The impact and role of higher (secondary or
tertiary) education is, however, less obvious.93


Despite geographical hurdles, modern forms of infrastructure and communication make
an exchange between remote areas possible. Roads, on which goods and services may
travel, are crucial to starting business with the periphery. Infrastructure can counterbalance
geographical disadvantages by enabling participation in national or international
trade.94 Accordingly, Cashdan (2001) shows that ethnic fragmentation is indeed lower
where land and water transportation are more efficient.95


91 Collier (1998), for example, discusses how more democratic regimes might only emerge (or at least
emerge more easily) in countries where ethnic differences are less problematic.

92 Turkey, for example, still partly prohibits the native Kurdish language and promotes an education 
system exclusively in Turkish. Aimed at marginalizing this culture and to repress its minorities, it still uses 
discriminatory language in school books (European Commission, 2006). Aspachs-Bracons et al. (2007b) 
show in a case study for Catalonia that the identification with the Catalonian identity significantly increased 
after the introduction of Catalan as the compulsory educational language in schools. This effect was found 
irrespective of the parents' origin (Spanish versus Catalan). 

93 Barro (1999) also finds differences in terms of explanatory power of the various education levels on
democratization. Whereas average years of attainment and the gender gap at the primary level have high
explanatory power, secondary and higher levels of education do not.

94 For a detailed survey of infrastructure and its impact on trade flows, see Limao and Venables (2001).

95 For recent studies on how the ‘modernization’ theory of nationalism (economic, infrastructural and
political development) affects ethnic identification in Africa, see Eifert et al. (2007) and Robinson (2009).


2.4 Empirical strategy and data


In order to relate to the existing literature, some of the key results of Ahlerup and Olsson
(2007) and Michalopoulos (2011) regarding a ‘base-level’ of ethnic heterogeneity are
replicated. This analysis covers the effects of the evolutionary and historical factors
discussed in section 2.3, which stay constant or change only over long periods of time.


The corresponding ordinary least squares (OLS) regression is:

\[
ELF_i = \beta_0 + \beta_1 \cdot Z_i + \epsilon_i \tag{2.2}
\]


where $ELF_i$ is the ELF level in country $i$, $Z_i$ is a vector of the various independent variables,
and $\epsilon_i$ is a random error term. The model uses heteroskedasticity-robust estimators.
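
To illustrate what a heteroskedasticity-robust OLS estimate involves, the sketch below fits a regression with a single regressor and reports White's HC0 standard error for the slope. It is a simplification under stated assumptions: the text does not say which robust variant is used, the vector of regressors is reduced to one variable, and the function name `ols_robust` and the toy numbers are hypothetical.

```python
import math


def ols_robust(x, y):
    """Fit y = b0 + b1 * x by OLS and return (b0, b1, robust_se_b1).

    The standard error uses White's HC0 formula for one regressor:
    sum((x_i - mean_x)^2 * e_i^2) / (sum((x_i - mean_x)^2))^2.
    HC0 is an assumption; the source does not name the variant.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    sxx = sum(d * d for d in dx)
    if sxx == 0:
        raise ValueError("regressor must vary across countries")
    b1 = sum(d * (yi - mean_y) for d, yi in zip(dx, y)) / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    hc0_var = sum(d * d * e * e for d, e in zip(dx, residuals)) / (sxx * sxx)
    return b0, b1, math.sqrt(hc0_var)


# Toy numbers chosen for easy arithmetic, not real country data:
# x stands in for one static regressor, y for ELF levels.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 0.1, 0.1, 0.3]
b0, b1, se_b1 = ols_robust(x, y)
```

In applied work one would normally rely on an established statistics package rather than hand-written formulas; the closed form is shown only to make the estimator concrete.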

Having analyzed the static variables influencing ELF levels, the second step focuses on new insights into
how fragmentation changes over a rather short period.
An adjusted growth model is used here, taking into account level data that do not change over the
period and the relevant variables that should be responsible for the change in ELF levels.
The linear regression model is specified as follows:

\[
\Delta ELF_i = \beta_0 + \beta_1 \cdot Z_i + \beta_2 \cdot \Delta X_i + \epsilon_i \tag{2.3}
\]

where $\Delta ELF_i$ is the change in the ELF value of country $i$ between the two points in
time. Vector $Z_i$ contains level data that are static (e.g., country size). These factors are
controlled for, as the timing or magnitude of changes could be influenced by their presence.
In a very mountainous country, ethnic fragmentation might be much more stable than in
a small country that does not have any geographical barriers. $\Delta X_i$ instead contains the
relevant changes in the socioeconomic and policy variables over the period covered. $\epsilon_i$ is
a random error term and, again, the model uses heteroskedasticity-robust estimators.
The key question for the empirical operationalization is which source should be used
for the ELF values. Defining ethnic groups depends heavily on the subjective decisions
of the respective authors. Combining two sources over different points in time is highly
difficult. A distinction between differences in definitions and real changes in a country's
ethnic setup is all but impossible. The only data source that offers ethnic heterogeneity
data on two points in time is the Atlas Narodov Mira (ANM), compiled by Russian ethnographers
(Bruk, 1964; Bruk and Puckov, 1986). Although only the first edition of the
Atlas Narodov Mira (Bruk, 1964) is widely used in the literature, there is a second edition
from the mid-1980s (Bruk and Puckov, 1986).96 Some later criticism centered on the
ANM's bias towards splitting groups along linguistic rather than ethnic lines. This underestimates


96 As both are only published in Russian, this essay relies on Roeder (2001), who calculated and published 
ELF values based on these two editions. Roeder (2001) also calculates ELF values in three different ways, 
depending on the aggregation levels of sub-groups reported in the original data. Following the approach 
of Alesina et al. (2003), this analysis is based on the most disaggregated values that use all sub-groups 
reported.

the fractionalization in regions like Latin America, where Spanish is widely spoken by
minority populations. More important for this essay is that the definition of the groups
follows the same lines at both points in time, and less whether the chosen group characterization
is the correct one.97 Despite the criticism of the ANM data, comparing
them with the two main alternatives, Alesina et al. (2003) and Fearon (2003), yields high
correlations, as displayed in Table 2.1.98


            ANM '61   ANM '85   Alesina   Fearon
ANM '61     1
ANM '85     0.949     1
Alesina     0.843     0.786     1
Fearon      0.814     0.839     0.858     1


Table 2.1: Spearman rank correlations of main ELF indices 
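How a Spearman rank correlation of the kind shown in Table 2.1 is obtained can be sketched as follows. This is a minimal illustration with hypothetical ELF values, not the computation actually used for the table; ties are given average ranks, which is the usual convention.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


# Hypothetical ELF scores for the same four countries from two sources.
rho = spearman([0.10, 0.45, 0.70, 0.90], [0.15, 0.40, 0.85, 0.80])
```

Because only the ordering of countries matters, two indices can correlate highly in rank even if their absolute ELF levels differ.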


Additionally, one might argue that the data have been assembled under the auspices of the
Soviet Union, with a significant bias between Eastern and Western countries. Taylor and
Hudson (1972) tested for this potential problem but did not find anything to support this
argument.99 Finally, Weidmann et al. (2010) conclude that the ANM data “is complete
and carefully researched, it relies on a uniform group list that is valid across state borders”
(Weidmann et al., 2010, p. 5). The last point is probably the most important for this essay.

Based on the sources used to calculate the ELF values, Roeder (2001) reports the data
to be for the years 1961 and 1985. As yearly data on most of the covariates used to explain
ethnic heterogeneity and its trends are scarcely available, average values for 1960-65 (the
first point in time) and for 1975-80 (the second) are used.100 An important reason for
taking the average of several years, instead of single ones, is to avoid, or at least reduce,
the impact of cyclical deviations. For the later time span, one could alternatively use
1980-85 instead of 1975-80. The period from 1975-80 is preferable for two reasons. First,
if ethnic fragmentation adjusts in reaction to policy changes, as is argued in this essay,
it needs time to adapt and will not change immediately. Taking a lag of five years gives
some room for these adjustments to occur.101 Second, the time elapsed between
changes in policy variables and the ELF adaptations limits the suspicion of reverse
causality, i.e., that ELF changes are responsible for policy adjustments.
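The construction of the two data points per covariate can be sketched as follows. The function name, the dictionary layout and the urbanization figures are hypothetical; the only elements taken from the text are the averaging windows 1960-65 and 1975-80 and the use of whatever years are available within them.

```python
def period_average(series, start, end):
    """Mean of the available yearly values in [start, end]; None if nothing is observed."""
    values = [v for year, v in series.items()
              if start <= year <= end and v is not None]
    return sum(values) / len(values) if values else None


# Hypothetical urbanization rates (%), observed only in five-year steps.
urbanization = {1960: 30.0, 1965: 34.0, 1970: 37.0, 1975: 40.0, 1980: 44.0}

first_point = period_average(urbanization, 1960, 1965)
second_point = period_average(urbanization, 1975, 1980)
change = second_point - first_point
```

Averaging over a window rather than picking a single year is what dampens cyclical deviations, at the cost of blurring the exact timing of a change.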


97 For more information on the data offered in the Atlas Narodov Mira and a high-level comparison to
other sources, see Weidmann et al. (2010).

98 For their ELF indices both combine different sources, mainly the CIA Factbook (CIA, 2011) and the
Encyclopædia Britannica (2007). Whereas Alesina et al. (2003) intend to always select the most granular
source, Fearon (2003) limits the data to groups that constitute at least 1% of a country's population.

99 The same conclusion is drawn by Ginsburgh and Weber (2011) in comparing the ANM data with other
sources of this time, namely Roberts (1962) and Muller (1964). The data of both sources were also found
in Roeder (2001).

100 In the early 1960s, data were often only available in five-year spans. Taking six-year averages increases
the data availability for many countries for the first point in time.

101 Analyzing the adjustment times between policy changes and ELF value changes, which might differ
considerably between variables, is an interesting area for future research.

                 ANM 1961                  ANM 1985                  Delta ('85-'61)
Region       Obs.  Mean   Std.Dev.     Obs.  Mean   Std.Dev.     Obs.  Mean    Std.Dev.
World        138   0.463  0.278        168   0.461  0.272        138    0.006  0.086
Asia          22   0.483  0.295         27   0.467  0.306         22   -0.035  0.053
E. Europe      5   0.138  0.094         26   0.371  0.207          5   -0.029  0.038
L. America    25   0.446  0.194         26   0.443  0.213         25    0.012  0.061
MENA          19   0.318  0.165         20   0.342  0.222         19    0.040  0.177
SSA           45   0.674  0.226         46   0.663  0.235         45   -0.011  0.037
W. Count.     22   0.231  0.210         23   0.273  0.227         22    0.050  0.076


Table 2.2: Summary statistics of Atlas Narodov Mira data for 1961, 1985 and its change between 
1961 and 1985 


Roeder (2001) reports data for 138 countries at both points in time based on the
respective edition of the Atlas Narodov Mira.102 Table 2.2 displays the distribution of
ELF values across regions for both years. The highest median level is found, as expected,
in Sub-Saharan Africa (SSA) and the lowest in Western countries.103 This picture is
consistent in both years. The same is true of the intermediate ELF values for Asia, Latin
America and the Middle East and North Africa (MENA). The huge change in ELF values
in Eastern Europe between the 1961 values and those of 1985 comes entirely from an
increase in the number of countries under observation, from five in 1961 to 26 in 1985.104


Regions that became more homogeneous (a decreasing ELF value) display negative
values, whereas regions that became more heterogeneous (an increasing ELF value) show
positive values. Although the median country per region did not change much, all bar
33 countries report a change in their respective ELF value.105 The biggest changes were
experienced in the MENA region, where countries moved significantly in both directions.
Nevertheless, some tendencies of regional drift can be noted. Whereas Asia experienced a
homogenization, Latin America and the Western countries displayed some heterogenization.
On average, Sub-Saharan Africa did not experience much variation over the 20 years
in question.
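The sign convention of the Delta column can be made concrete with a short sketch. It assumes the standard fractionalization formula, ELF = 1 minus the sum of squared group shares, which is taken here to be the definition behind the ANM-based values; the group shares are invented, and treating changes of 0.01 or less as "stable" is only an illustrative threshold borrowed from the marginal-change remark in the footnote.

```python
def elf(shares):
    """Ethnolinguistic fractionalization: 1 minus the sum of squared group shares."""
    total = sum(shares)
    return 1.0 - sum((s / total) ** 2 for s in shares)


def drift(elf_start, elf_end, tolerance=0.01):
    """Classify a change in ELF; the tolerance is an illustrative assumption."""
    delta = elf_end - elf_start
    if abs(delta) <= tolerance:
        return "stable"
    return "heterogenization" if delta > 0 else "homogenization"


# Hypothetical country: the largest group grows from 60% to 70% of the population.
elf_1961 = elf([0.60, 0.30, 0.10])
elf_1985 = elf([0.70, 0.25, 0.05])
direction = drift(elf_1961, elf_1985)
```

A negative delta, as in this toy case, corresponds to the homogenization reported for Asia, a positive one to the heterogenization reported for Latin America and the Western countries.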


102 In total, data are reported for 151 countries for the two points in time. However, some (former)
countries where no additional data were available and countries that changed considerably over the time
due to secession (e.g., Pakistan/Bangladesh) or union (e.g., Vietnam) were excluded.

103 Besides the European countries, this includes developed nations like Australia, Canada, Japan, New
Zealand and the United States. Categorization is taken from Fearon (2003).

104 The countries for which values are available for both points in time are Albania, Bulgaria, Hungary,
Poland and Romania. Their mean ANM value for 1985 is 0.109.

105 This includes countries that only exhibited a marginal change of ± 0.01 or less.

2.5 Results


2.5.1 Influential factors on a ‘base-level’ of ethnic fragmentation 


The regressions from Table 2.3 are based on Equation (2.2) and include the major
geographical variables already discussed. Latitude reflects the distance from the equator,
Altitude measures the altitude variation that is found within a country, and Area is its
surface area.106 The further away a country is located from the equator, the more one
would expect decreasing biodiversity and, in turn, lower ethnic fragmentation. Latitude
has the expected negative sign and is highly significant (at the 1% level). The location of
Sweden compared to Uganda would explain nearly half of their ELF difference, for
example.107 Larger and more mountainous countries have a higher probability of encompassing
different habitats. This allows for more solitary areas that facilitate the development of
different species and ethnic groups, also acting as a barrier which ensures their
sustainability. Altitude does not have a significant impact at conventional levels in the first
regression, but Area again exhibits a highly significant, positive impact on a country's
heterogeneity at the 1% level.

The fourth variable included in the first regression is Agritime. It captures the time
elapsed since the transition from a hunter-gatherer economy to agricultural production,
covering the historical development of institutions. The earliest countries transitioned
around 8500 B.C. and the latest only around 1600 A.D.108 Countries that made the
transition earlier in time should then exhibit a lower level of fragmentation as they had more
time to develop into more advanced civilizations. Indeed, Agritime displays a negative
sign that is significant at the 10% level. The different transition times between the first
and the last countries (approx. 10,000 years) lead to 0.15 lower ELF values.
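The size of this effect can be retraced with a short calculation. The sketch assumes that Agritime enters the regression in units of 1,000 years, which is not stated here but is the scaling under which the coefficient of -0.015 from regression (1) of Table 2.3 reproduces the quoted 0.15; the function name is hypothetical.

```python
def implied_elf_gap(coef_per_1000_years, earliest_year, latest_year):
    """ELF difference implied by the gap between two transition dates.

    Years B.C. are passed as negative numbers. The per-1,000-years scaling
    of the coefficient is an assumption, not taken from the source.
    """
    gap_in_1000_years = (latest_year - earliest_year) / 1000
    return coef_per_1000_years * gap_in_1000_years


# Agritime coefficient of regression (1); transitions around 8500 B.C. and 1600 A.D.
gap = implied_elf_gap(-0.015, -8500, 1600)
```

With a gap of roughly ten thousand years the product is about -0.15, i.e., the earliest transitioning countries are predicted to have an ELF value about 0.15 lower than the latest ones, other things equal.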

In regression (2), another variable used by Ahlerup and Olsson (2007) is included.
The experience of a modern state (Modern) captures how many years a country had power over
its territory in the time between 1800 and 1950. It has a comparable interpretation
to Agritime, but captures to some extent the final result, or how well early civilizations
developed into modern civilizations. Therefore, it comes as no surprise that both variables
point in the same direction.109


Regression (3) controls for more specific geographical characteristics, including a
Tropics variable and regional dummies. The Tropics variable measures the percentage of a
country's total area classified as being exposed to a tropical climate. As expected, one
finds a positive and significant (10%) correlation between tropical climate and
fragmentation. None of the regional dummies are significant at conventional levels. Latitude, which
was highly significant in all previous regressions, loses its significant explanatory power
when the regional dummies are included. This is not too surprising as the regional division
partly reflects the distance from the equator. Additionally, Tropics seems to better capture
the idea of a different habitat around the equator. Nevertheless, the major geographical
variables, Altitude and Area, maintain their significance at the 1% and 10% level,
respectively. Thus, latitude, per se, is not the driver of a different level of heterogeneity
but the different geographical and climatic conditions found along the latitudinal stretch.

106 For a detailed description of the variables and their sources, see Table B.1 of Appendix B.1.

107 Latitude additionally functions as a proxy for migration distance or genetic fission. As the birthplace
of the modern human is supposed to be near today's Ethiopia, the distance from the equator also partly
covers the idea of human origin (Ahlerup and Olsson, 2007).

108 The first were Israel, Jordan, Lebanon and the Syrian Arab Republic, whereas Mauritius and Australia
were the last.

109 Both variables show a rather low correlation of 0.11.

                    (1)          (2)          (3)          (4)
                    ANM '85      ANM '85      ANM '85      ANM '85
Latitude            -0.611***    -0.460***    -0.378       -0.696***
                    (-5.65)      (-4.12)      (-1.44)      (-4.96)
Altitude            0.066        0.143**      0.137***     0.191**
                    (1.30)       (2.35)       (2.68)       (2.42)
Ln (Area)           0.024***     0.041***     0.023*       0.034*
                    (2.66)       (4.16)       (1.92)       (1.82)
Agritime            -0.015*      -0.021**     -0.006       -0.009
                    (-1.76)      (-2.56)      (-0.49)      (-0.73)
Modern                           -0.027***                 -0.017*
                                 (-4.93)                   (-1.69)
Tropics                                       0.174*
                                              (1.87)
Asia                                          -0.051
                                              (-0.51)
E. Europe                                     0.099
                                              (1.58)
L. America                                    -0.114
                                              (-1.16)
MENA                                          -0.001
                                              (-0.01)
SSA                                           0.133
                                              (1.30)
Democratic trad.                                           0.009**
                                                           (2.39)
Constant            0.575***     0.715***     0.356*       0.606***
                    (8.93)       (10.62)      (1.79)       (3.14)
Observations        158          142          151          66
Adjusted R²         0.279        0.384        0.349        0.357
F-Test              17.779       20.983       12.200       10.320

Heteroscedasticity robust standard errors used;
t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01

Table 2.3: Influence of geographic and historical variables on Atlas Narodov Mira ELF scores.

More democratic regimes are considered to give their citizens more freedom of personal
expression and therefore might also exhibit a higher level of heterogeneity. Democratic
tradition is measured by the average Polity score after World War II (1945-1960),110
developed by Marshall and Jaggers (2008), ranging between -10 and 10. Democratic Tradition
displays the expected positive sign at the 5% level.111

As the covariates did not change between the first and the second ANM data, there
should be no different result in using the latter. As expected, there is no qualitative
difference between the two data sets and the results remain very much comparable.112 All
results so far are in line with the results of Ahlerup and Olsson (2007) and Michalopoulos
(2011). As these authors do not test their hypotheses using the ANM data, but rely on the
ELF indices from Alesina et al. (2003) and Fearon (2003), regressions (1) and (2) from
Table 2.3 are replicated for both alternative sources. The results are reported in Table 2.4
and generally support all findings discussed so far. These results give additional credibility
to the ANM data.113


                (1)          (2)          (3)          (4)          (5)          (6)
                ANM '85      ANM '85      Alesina      Alesina      Fearon       Fearon
Latitude        -0.611***    -0.460***    -0.691***    -0.566***    -0.739***    -0.548***
                (-5.65)      (-4.12)      (-6.72)      (-5.56)      (-7.07)      (-5.37)
Altitude        0.066        0.143**      0.037        0.099        0.068        0.163***
                (1.30)       (2.35)       (0.68)       (1.51)       (1.46)       (2.86)
Ln (Area)       0.024***     0.041***     0.025***     0.041***     0.018*       0.032***
                (2.66)       (4.16)       (2.87)       (4.62)       (1.82)       (3.15)
Agritime        -0.015*      -0.021**     -0.005       -0.009       0.003        -0.009
                (-1.76)      (-2.56)      (-0.55)      (-1.00)      (0.37)       (-0.98)
Modern                       -0.027***                 -0.025***                 -0.029***
                             (-4.93)                   (-4.26)                   (-5.65)
Constant        0.575***     0.715***     0.550***     0.670***     0.569***     0.746***
                (8.93)       (10.62)      (9.62)       (10.23)      (7.63)       (9.87)
Observations    158          142          160          143          150          139
Adjusted R²     0.279        0.384        0.305        0.407        0.282        0.410
F-Test          17.779       20.983       21.762       21.484       18.554       24.314

Heteroscedasticity robust standard errors used; t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 2.4: Comparison between various ELF measures — influence of geographic and historical 
variables 


Exceeding the scope of the studies by Ahlerup and Olsson (2007) and Michalopoulos
(2011), this essay investigates the role of colonization in more depth. Table 2.5
shows the main results. Regression (1) is the aforementioned setup for the full set of


110 Alesina and Zhuravskaya (2011) use a comparable time frame to assess a democratic tradition variable.
If the time frame for the Democratic Tradition variable is extended to 1900-1960, the results do not change,
but the observations are further reduced.

111 A caveat is that the inclusion of the Democratic Tradition variable nearly halves the number of
observations. Using alternative data sources (e.g., Vanhanen (2000)) has the same limitations. That is also
why these variables are not included in coming regressions, unless explicitly controlling for the role of
democracy.

112 Results are reported in Table B.4 of Appendix B.2.

113 If not otherwise stated, the data of 1985 is subsequently used for the regressions as it contains more
observations than the data from 1961.

countries. In regression (2), a Colony dummy is included to control for the possibility
of former colonies generally exhibiting differences in their ethnic fragmentation from
non-colonial countries. Former colonies are associated with an ELF value that is approximately
0.17 lower, whereas the regional dummies for Sub-Saharan Africa and Latin America
are not significant. This result could be driven by the linguistic bias of the heterogeneity
data. Especially in Latin America, the colonial regime left a common language.
Regression (3) tests this by entering interaction terms for the colony and the regional
dummies for Latin America and Sub-Saharan Africa.114 Although they have the expected
sign, Latin America negative and Sub-Saharan Africa positive, neither is significant at
conventional levels. The result of the Colony variable is not altered greatly. The longer
the colonial powers stayed, the more settlers might have settled in the new countries
permanently. Aligned with earlier findings of Ahlerup and Olsson (2007), the colonial
duration (Duration) has a positive, but barely significant, impact on fragmentation as
displayed in regression (4). Controlling for the colonizer's homeland in regression (5), one
finds no significant correlation for Spanish and British colonizers, and a barely significant
one for former French colonies.
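Why a regional dummy can drop out once the interaction terms are entered, as the footnote on regression (3) reports for Latin America, can be illustrated with a toy example. The sketch assumes that every Latin American country in the sample is a former colony, which is one plausible reading of the reported collinearity; all values below are invented.

```python
def interaction(a, b):
    """Element-wise product of two 0/1 dummy columns."""
    return [x * y for x, y in zip(a, b)]


# Hypothetical six-country sample.
colony        = [1, 1, 1, 1, 0, 0]
latin_america = [1, 1, 0, 0, 0, 0]  # every Latin American country is a former colony
ssa           = [0, 0, 1, 0, 1, 0]  # one Sub-Saharan country was never colonized

colony_x_la = interaction(colony, latin_america)
colony_x_ssa = interaction(colony, ssa)

# Identical columns cannot be estimated separately: one of them has to be dropped.
la_is_redundant = colony_x_la == latin_america
ssa_is_redundant = colony_x_ssa == ssa
```

In this toy sample the Latin America dummy and its interaction with Colony are the same column, so only one can stay in the regression, whereas the Sub-Saharan Africa dummy and its interaction differ and can both be estimated.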


In regressions (6) and (7), the idea of Acemoglu et al. (2001) is taken up,
exploring the implication of how, rather than by whom, countries were colonized. The
'divide-and-rule' approach simply exploited the country without any long-term interest.
In other countries, however, colonizers established institutions to sustain long-term
development and settlements. Acemoglu et al. (2001) attribute differences in these two
approaches to the differences in living conditions the colonizer came upon at that time, i.e.,
the mortality rate amongst the Europeans arriving in their respective colonies. In
countries with higher mortality rates, the colonizers did not want to create lasting structures
and institutions intended for long-term settlements. In the absence of good institutions,
ethnic identification is more important to sustain group cohesion and economic activities.
A more extractive colonization approach, additionally, exploited differences between
groups, deepening them and turning the groups against each other, leading to a higher
or deepened heterogeneity. Higher mortality rates, leading to worse institutions and more
ethnically motivated turmoil, should therefore be associated with more heterogeneous countries.
Indeed, Mortality displays a significant positive correlation with the level of fragmentation.
Including the Mortality estimate also affects the colonizer homeland dummies, rendering
the British dummy significant at the 5% level.116


114 The dummy for Latin America drops out due to perfect collinearity with the Colony dummy.

115 An additional caveat is that it is hard to distinguish whether the effect reflects a reverse causality, and
colonial powers just chose more homogeneous countries for their colonization efforts.

116 Including the mortality variable increases the explanatory power of the model, increasing the adjusted
R² from 0.39 to 0.52 between regressions (1) and (5). However, the number of observations decreases again
significantly. Alternatively using the extended data on early disease environment compiled by Auer (2009)
increases the number of observations slightly but does not yield significant results for this variable and
again reduces the level of the adjusted R².

            (1)           (2)           (3)           (4)          (5)          (6)          (7)
            Full sample   Full sample   Full sample   Colonies     Colonies     Colonies     Colonies
Latitude    -0.460***     -0.639***     -0.668***     -0.839***    -0.868***    -0.769***    -0.923***
            (-4.12)       (-3.84)       (-3.60)       (-4.42)      (-3.69)      (-3.82)      (-4.46)
Altitude    0.143**       0.158***      0.160**       0.147*       0.184*       0.201**      0.317***
            (2.35)        (2.69)        (2.30)        (1.69)       (1.83)       (2.23)       (2.91)
Ln(Area)    0.041***      0.045***      0.046***      0.068***     0.063***     0.063***     0.049***
            (4.16)        (4.65)        (4.04)        (6.08)       (5.13)       (4.62)       (3.65)
Agritime    -0.021**      -0.032***     -0.033***     -0.045**     -0.053***    -0.042**     -0.053**
            (-2.56)       (-2.67)       (-2.64)       (-2.64)      (-2.93)      (-2.05)      (-2.44)
Modern      -0.027***     -0.026***     -0.025***     -0.037***    -0.032***    -0.038***    -0.007
            (-4.93)       (-4.03)       (-3.80)       (-5.36)      (-3.43)      (-4.30)      (-0.73)
L. America -0.009 
(-0.15) 
SSA 0.088 -0.039 
(1.48) (-0.17) 
Colony -0.171** -0.187**
(-2.11) (-2.27)
Colony*LA -0.009 
(-0.13) 
Colony*SSA 0.129 
(0.57) 
Duration 0.003 0.000 
(1.65) (0.14) 
Spanish colony 0.069 -0.119 
(0.92) (-1.47) 
French colony 0.100* 0.069 
(1.87) (0.90) 
British colony 0.062 0.155** 
(0.95) (2.19) 
Ln(Mortality) 0.034* 0.052*** 
(1.99) (3.34) 
Constant 0.715*** 0.855*** 0.870*** 0.733*** 0.722*** 0.658*** 0.327* 
(10.62) (6.43) (5.78) (9.11) (7.09) (4.42) (1.95) 
Observations 142 142 142 86 85 59 58
Adjusted R² 0.384 0.417 0.414 0.415 0.381 0.515 0.545
F 20.983 15.811 12.088 15.825 10.368 11.656 8.661

Heteroscedasticity robust standard errors used; t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 2.5: Influence of various colonization characteristics on Atlas Narodov Mira ELF scores 


Analyzing the influence of evolutionary and historical factors, two important insights
become clear. First, earlier findings with different data sets, showing that geographical
attributes (especially Altitude, Area and Latitude) are highly responsible for the ‘base level’
of heterogeneity, are confirmed. Second, attention is drawn to the role of colonization. This
essay argues that the homeland of the colonizers is less important for a former colony’s
heterogeneity than how the colonial powers actually pursued their endeavors.

2.5.2 Drivers of fragmentation level changes over a short period


Table 2.6 reports the first results of the regressions based on Equation (2.3). It contains
all variables that display a change over the period covered, i.e., variables of the vector
ΔX_i. Latitude, Altitude, Area, Agritime and the ANM values in 1961 are included as
static variables of vector Z_i. Although the variables of vector Z_i do not show any changes
over time, they might have a mediating role for any adaptations. This is why they are
controlled for in all regressions in this table. However, almost none of the variables are
significant at conventional levels.117

As discussed earlier, data availability in the early 1960s poses a major limitation to
the regressions. This essay tries to make the best possible trade-off between including
additional variables, and thereby reducing the risk of omitted variables, and not reducing
the number of observations available too much.

Regression (1) controls for the most important changes in developing countries
regarding their settlement and population patterns. Metropolitan areas attract people from the
countryside with the prospect of a better economic future. Many old traditions are left
behind, and one tries to merge into the more mainstream culture of major cities. The change
in Urbanization, measured as the percentage of the population living in urban areas, does
indeed have a significant negative impact on the level of heterogeneity. As expected,
the most obvious effect of Immigration on heterogeneity is positive. Both are significant
at the 5% level. Comparing both effects, immigration plays a bigger role. An increase of
one standard deviation in the change in immigration increases the change in heterogeneity
by 0.44 standard deviations, whereas the same change in urbanization leads to a decrease
of 0.19 standard deviations. Population growth (Population) shows no significant impact
at conventional levels in this initial regression.
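The comparison in standard deviations rests on standardized (beta) coefficients. The following minimal sketch shows the usual conversion, in which the raw coefficient is scaled by the ratio of the regressor's to the outcome's sample standard deviation; the function name and the numbers are hypothetical and not taken from Table 2.6.

```python
from statistics import stdev


def beta_coefficient(raw_coef, x_values, y_values):
    """Standardized coefficient: SD change in y per one-SD change in x."""
    return raw_coef * stdev(x_values) / stdev(y_values)


# Hypothetical data: change in immigrant share (x) and change in ELF (y).
delta_immigration = [0.0, 2.0, 4.0, 6.0]
delta_elf = [-0.02, 0.00, 0.01, 0.05]
beta = beta_coefficient(0.005, delta_immigration, delta_elf)
```

Because the result is unit-free, it allows variables measured on different scales, such as Immigration and Urbanization, to be compared directly.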

In regression (2), primary schooling rates (Primary Schooling) are included.119 This
variable not only covers educational attainment, and, to a large extent, the overall level
of education in a country, but can also be understood as a proxy for state influence on
an increasing part of the population. It shows a significant negative impact and lowers
the size and significance level of both Urbanization and Immigration. Primary Schooling
and Immigration display the highest impact with beta-coefficients of -0.26 and 0.41,
respectively.120
Controlling for various other variables in regressions (3)-(6), the significant influence of
Urbanization, Immigration and Primary Schooling persists, at least at the 5% level. Neither
a change in the level of democracy (Polity IV), the number of conflict years (Conflicts)
nor Trade and Infrastructure (Telephones) show any significant impact.121 Countries with
higher population growth rates demonstrate a significant positive impact in two regressions
(3)/(5). Including changes in logged GDP per capita levels (GDP/capita) in regression
(7) makes all variables, except for Primary Schooling and Urbanization, insignificant.
Although the variable carries the expected negative sign, it is also insignificant at
conventional levels. Most of the socioeconomic policy variables are very strongly associated
with higher wealth levels of a country, reflected in growing GDP per capita levels. The
fact that Primary Schooling and Urbanization remain significant, although the GDP per
capita increase is included, confirms that they do not merely act as a transmission channel
for a higher GDP. Regression (7) is also the only one where Immigration loses its significance.
As immigrants are attracted to prosperous countries, i.e., countries with high GDP per
capita growth rates, a high correlation with immigration is inevitable. Controlling for
regions in regression (8) does not alter the results, although the significance levels are
lower. In addition, because none of the regional dummies are significant at conventional
levels, the results are not driven by regional differences.

117 Results are reported in Table B.5 of Appendix B.2.

118 A recent survey on Africa also found that a higher degree of urbanization alters ethnic identification
in favor of national identification (Robinson, 2009).

119 Primary Schooling is measured as the average years of primary school attainment, provided by Barro
and Lee (2010).

120 Green (2011) does not apply level variables and uses different time frames with comparable time frames
allowing adaptation in the ethnic fragmentation. He finds an equally important influence of urbanization.
However, migration is in his analysis only relevant in highly urbanized countries. In contrast to the findings
here, education is no focus in Green (2011).


As has already been pointed out in the discussion of the economic and policy factors,
it is hard to see why GDP per capita levels should have a direct impact on heterogeneity.
The regressions in Table 2.6 already showed some influential factors that all are highly
linked to GDP per capita and its rate of growth. However, as the overall economic development
of a country plays a crucial role, it is also controlled for here. This is done more as an
additional robustness check than to generate new insights. Taking selected
regressions from Table 2.6, in Table 2.7 various measures of (economic) development are
included. Regressions (1) and (2) are those already known. In regressions (3) and (4), the
GDP per capita level in 1960, based on the Penn World Tables (Heston et al., 2009), is
added to the otherwise unchanged setup.122 Urbanization and Immigration, which show
the highest correlation with the GDP per capita level, become insignificant. Instead, the
GDP per capita level at the beginning of the period is positive in all regressions and
significant at least at the 10% level. Primary Schooling shows lower, but still significant
values if the GDP per capita level is included. If GDP growth (change in GDP per capita
levels) is included, the significance fades. This has two important interpretations. First,
the results for Primary Schooling are robust. Although the GDP per capita level variable
absorbs some of its influence, its significance does not change considerably. Countries that
are richer already have much higher primary schooling figures, so changes would be expected
to be smaller. Still, the influence persists. Second, countries that already have a higher level
of development seem to move in the other direction, thus becoming more heterogeneous.
Most of the highly developed countries are classic immigration countries, like the US,
Australia and Canada. This hints at a curvilinear relationship of ethnic identification and
development or ‘modernization’, discussed in Bannon et al. (2004). Ethnic fragmentation
is not necessarily a sign of backwardness.

121 That conflicts have no impact on the ethnic heterogeneity is rather surprising. However, if one uses
different conflict data sources (PRIO data (Gleditsch et al., 2002), Genocides and Political Instability Task
Force (PITF) data (Marshall et al., 2010)) the non-significant result is confirmed.

122 The results displayed are based on the Laspeyres index of the Penn World Tables. The regressions
with the Chain index yield the same results.


                      (1)        (2)        (3)        (4)        (5)        (6)
                      ANM ch.    ANM ch.    ANM ch.    ANM ch.    ANM ch.    ANM ch.
Ln (Urbanization)     -0.048*    -0.062**   -0.030     -0.022     0.022      0.027
                      (-1.98)    (-2.43)    (-1.02)    (-0.74)    (0.63)     (0.75)
Immigration           0.005*     0.005      0.004      0.004      0.002      0.001
                      (1.85)     (1.26)     (1.16)     (1.17)     (1.19)     (1.13)
Ln (Population)       -0.006     -0.017     0.002      -0.016     -0.024     -0.023
                      (-0.09)    (-0.29)    (0.03)     (-0.27)    (-0.64)    (-0.64)
Primary Schooling     -0.056**   -0.031*    -0.031*    -0.025     -0.036**   -0.025
                      (-2.36)    (-1.77)    (-1.94)    (-1.57)    (-2.12)    (-1.53)
Ln (GDP/cap.) '61                           0.020*     0.022**
                                            (1.85)     (1.99)
Ln (GDP/cap.)                    -0.020                -0.028
                                 (-0.94)               (-1.37)
HDI level                                                         0.142***   0.155***
                                                                  (3.40)     (3.66)
HDI                                                                          -0.366**
                                                                             (-2.43)
Constant              0.069**    0.066**    -0.113     -0.115     -0.021     -0.004
                      (2.10)     (2.16)     (-1.24)    (-1.27)    (-0.55)    (-0.11)
Level var. included   yes        yes        yes        yes        yes        yes
Observations          116        91         91         91         98         98
Adjusted R²           0.254      0.160      0.194      0.196      0.180      0.219
F-Test                2.753      3.162      2.904      3.024      3.258      3.191

Included level variables (Z_i): Latitude, Altitude, Area, Agritime and the ANM values in 1961
t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 2.7: Influence of various economic and human development levels at the beginning of the 
period (average 1960-65) — dependent variable, change in Atlas Narodov Mira ELF scores 


Regressions (5) and (6) use Human Development Index (HDI) levels (UNDP - United
Nations Development Programme, 1994). This is a broader indicator of development, not
only taking into account GDP per capita levels, but also health and education figures. In
general, the results are very much comparable to the results discussed above. The broader
construction of the HDI, especially the inclusion of education variables, explains why the
HDI variable is the only one where the change variable also has a significant and negative
impact, taking over the influence of the Primary Schooling variable.


As further robustness checks, the key regressions of Table 2.6 are run again with
different model specifications. Both fixed-effect (FE) and random-effect (RE) models are
tested. The results are reported in Table B.6 of Appendix B.2. Using the FE model, a
correlation between the entity specific error term and the explanatory variables is allowed.
Furthermore, all level variables that are time-invariant are removed from the regressions.
The RE model, in contrast, assumes independence between the entity error term and the
explanatory variables. From the discussion above, the superior suitability of the FE model
is clear.123 Although the values of the coefficients vary, the significant positive or
negative effects of the main variables, Urbanization, Immigration and Primary Schooling, are
clearly confirmed.
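The Hausman test mentioned in the footnote on the FE model compares the two sets of estimates. The sketch below covers only the single-coefficient case, where the statistic reduces to the squared difference of the estimates divided by the difference of their variances; the inputs are invented, and the actual test in the appendix regressions would involve the full coefficient vectors and covariance matrices.

```python
CHI2_CRITICAL_5PCT_1DF = 3.841  # 5% critical value of a chi-squared with one degree of freedom


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic for one coefficient estimated by FE and RE."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        # Under the null, RE is efficient, so its variance should be the smaller one.
        raise ValueError("variance difference must be positive")
    return (b_fe - b_re) ** 2 / var_diff


# Hypothetical estimates for a single variable.
statistic = hausman_scalar(b_fe=0.5, b_re=0.4, var_fe=0.010, var_re=0.006)
prefer_fe = statistic > CHI2_CRITICAL_5PCT_1DF
```

A statistic above the critical value rejects the RE assumption of independence between the entity error term and the regressors and therefore favors FE; in this invented case the statistic stays below it.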


Because Primary Schooling seems to play a crucial role, Table 2.8 depicts the influence of
different measures of education, as well as various education levels, to test for the
robustness of the finding.124 Regression (1) corresponds to the second regression in Table 2.6.
Immigration and Primary Schooling are both significant. In regression (2), Secondary
Schooling and Tertiary Schooling are included in addition. The coefficient of Primary
Schooling remains significant and increases in size. Looking at the role of higher
education reveals another interesting insight. Secondary Schooling enters the regression with a
significant and positive sign. Higher education apparently has a different effect on
fragmentation than primary education. While the effect of primary education is uniformly
negative, that of secondary education is mostly positive.

Regressions (3)-(7) confirm the findings with different measures of education, offered 
by Barro and Lee (2010). The total sum of all years of schooling (Schooling total) does not 
show any significant impact. This is not surprising. As primary and higher educational 
levels enter the regression with opposite signs, they seem to cancel each other out. All 
other regressions confirm the homogenizing impact of primary education. In most cases, 
the positive impact of higher education is also confirmed. However, the coefficients are no 
longer significant. These robustness checks confirm the apparent importance of primary
schooling for a country's homogenization and show that the finding does not depend on the
definition or measure of primary education.

In section 2.4, the time frame chosen was discussed. For endogeneity reasons, as well 
as the time needed for potential adjustments in heterogeneity, the time frame
1960/65-1975/80 was chosen. Nevertheless, the results should not entirely depend on the choice of
the time frame. As an additional robustness check, the time frame for all policy variables 
was changed from 1960/65-1975/80 to 1960/65-1980/85. The results are reported in Table 
B.7 of Appendix B.2. Although the coefficient sizes vary slightly, the significance levels of
all variables discussed change only marginally.


123 However, the Hausman test supports the FE model in only half of the regression pairs. This is the 
case for the regression pairs (2/6) and (3/7). Results of the Hausman test are not reported here. 

124 Indeed, Bossuroy (2011) also identifies lower educational levels as having the most robust and sizeable
positive impact on one’s ethnic identification. The higher the level of educational attainment, the more
individuals from surveys in West Africa identified with the nation instead of their ethnic group. This
consequently leads to lower ELF levels.

125 In their analysis of education’s role on trust, Knack and Keefer (1997) find a comparable differentiated
result for primary and secondary education. Additionally, in an analysis of ethnic identification for a small 
set of African states, Bannon et al. (2004) find that students identify themselves more along ethnic lines 
than farmers. 
2.6 Conclusion 


In line with the recent publications of Ahlerup and Olsson (2007) and Michalopoulos 
(2011) on the roots of ethnically diverse countries, the major results are confirmed. Al- 
though different data and data sources were used, the results remain robust. Geographical 
characteristics, like a country’s surface and altitude variation, and evolutionary factors, 
like the transition from sedentary farming, are major drivers of a ‘base-level’ of ethnic 
fragmentation. A more detailed view on colonization is also added to the analysis of geo- 
graphical and historical factors. Whereas the homeland of the colonizer seems to play no 
major role, the way a country was colonized does show a significant impact. Countries
where colonial powers had no incentive to settle and build good institutions, and
instead exploited the country’s resources, show a significantly higher level of ethnic fragmentation.
The manipulation of ethnic boundaries seemed to be an easy way to play one
group off against the other. Mistrust and rifts between ethnic groups seem to persist after
independence, mirrored in higher fragmentation levels.


What this chapter mainly wants to add to the recent discussion is that ethnic frag- 
mentation cannot be treated as being exogenous, or only being rooted in geographic and 
historical factors. Especially since the beginning of the 20th century, various policy and 
economic factors significantly changed the dynamics between ethnic groups, their inter- 
change and assimilation, as well as migration patterns. Migration proves to be the most 
important factor in changing a country’s fragmentation. Gulf countries, relying heavily 
on immigrants, show this trend most clearly. Doubtlessly, migration plays an even big- 
ger role in the globalized world after the 1960-1985 period analyzed in this essay. Its 
impact might therefore be even more pronounced today. The same is true for the other 
variables shown to have a significant impact on a country’s ethnic fragmentation. More 
policy-induced variables, like urbanization and especially primary education, leave their 
marks on a country’s heterogeneity. Urbanization and the growth of metropolitan areas,
attracting huge parts of the population, lead to an erosion of old habits and to an assimilation
into, or the emergence of, a ‘mainstream’ culture. Education is, according to the
findings of this chapter, not only a measure of a higher educational level attained. Because
primary education is, in general, the first point of contact with the state authorities, it is
also a good proxy for the government’s influence. As the government’s reach expands
to more remote areas, more and more people are exposed to its influence. In line with
recent findings of other authors, education does not influence heterogeneity uniformly. 
The empirical results suggest that higher educational levels lead to a more heterogeneous 
society. 

Nevertheless, this essay also faces some limitations. The range of possible variables to 
be tested in this analysis is rather confined due to data limitations in the early 1960s. Only
data on ethno-linguistic fragmentation, and not on other concepts regarding language
or religion, were available. In line with Campos and Kuzeyev (2007), the distinction
between ethnic, linguistic and religious fragmentation could be an interesting field for
future research. Not only could these different characteristics be driven by different factors, 
the time span in which changes occur and their direction might also be different. Both 
Campos and Kuzeyev (2007) and Fedderke and Luiz (2007) find more significant changes 
in the ethnic and racial setup than for the linguistic and religious characteristics. As the 
ANM data is mainly defined along linguistic characteristics, limited changes in its data 
may mask some results. 

Admitting that a country’s ethnic setup changes and can be influenced turns one
back to the growing literature on the effects of ethnic fragmentation. Having seen that
ethnic composition is changing with variables that are highly linked to the development 
level of a country, using a fixed measure of ethnicity for economic growth analysis seems 
rather unreasonable. This would attach greater importance to older measures of the 
ex-ante ethno-linguistic composition of a country in the analysis of economic outcomes, 
because the ethnic setup may have been endogenously determined by the factors under 
investigation. This is exactly what Campos and Kuzeyev (2007) find for their data set 
on former Soviet republics. Whereas the effect of an exogenous heterogeneity measure on 
growth is limited, the dynamic measure illustrates a significantly negative effect. 

Despite its limitations, the set of variables and data used for this essay show clear 
and very robust results. They are a very good basis to refute the assumption of static 
ethnic heterogeneity. More than a caveat, this essay offers a first attempt to venture into 
the dynamics of ethnic heterogeneity and gives a better understanding of how policy,
intentionally or unwittingly, can shape a country’s ethnic setup.
Chapter 3 


Measuring Ethnic Diversity 


3.1 Introduction 


There is a fast-growing literature on ethnicity and its role in the economic development
of a country or the incidence of conflicts.126 To advance the research in this area, current
approaches try to improve data sources, to increase their coverage, and to construct indices
that better measure the complexity of ethnicity. Because ethnicity is not a clear-cut concept,
it contains various aspects. Therefore, better indices in this regard do not mean more accurate
indices but rather those that reflect the different aspects more adequately. Starting with
the ethno-linguistic fractionalization index (ELF) by Taylor and Hudson (1972), an index
on polarization (Garcia-Montalvo and Reynal-Querol, 2002), the reduction to politically
relevant groups (Posner, 2004a) or the role of regional segregation of ethnicity (Alesina
and Zhuravskaya, 2011) have been studied more intensively.127

All these indices, however, are based on pre-defined groups within a country or principal
region. This gives rise to an important problem. All calculations rely on a rather arbitrary
definition of groups that do not necessarily share a comparable line of differentiation.128
Fearon (2003) summarizes the absence of a clear-cut definition of ethnic groups and
maintains “that in many cases there is no single right answer to the question ‘What
are the ethnic groups in this country?’” (Fearon, 2003, p. 197). To be less arbitrary, a
common differentiator, be it on the grounds of ethnicity, language, religion, or any other
characteristic, needs to exist. So, an assessment of distances between groups “is such an


126 Ethnic fractionalization is supposed to negatively affect corruption (Mauro, 1995), economic growth
(Alesina et al., 2003; Easterly and Levine, 1997), public goods provision (Alesina et al., 1999), communal
participation (Alesina and La Ferrara, 2000), general quality of government (Alesina and Zhuravskaya,
2011; La Porta et al., 1999) and democracy (Akdede, 2010). Collier (1998) initiated a new, and now
broad strand of literature exploring ethnicity's impacts on the incidence, onset or severity of conflicts that 
was furthered by the introduction of an index of polarization (Garcia-Montalvo and Reynal-Querol, 2003, 
2005a, 2008). 

127 For a broad overview of the literature on conflict, see Blattman and Miguel (2010). A good description 
of concepts and measures of ethnicity is found in Brown and Langer (2010). A new approach to better 
study ethnic distribution at the micro-economic level is to geo-reference ethnic groups (Weidmann et al., 
2010). 

128 For a similar line of critique, see Lind (2007). 
absolutely fundamental concept in the measurement of dissimilarity that it must play an
essential role in any meaningful theory of diversity or classification” (Weitzman, 1992, p.
365).129 This, however, requires more detailed information on the groups so that they
show a comparable level of distinction in any of the characteristics. Nearly all authors
treat these attributes equally irrespective of the differences between the groups, i.e., how
big the distance is. This is mainly because data on the different similarity levels are
either hardly available or quite complex. Yet it is obvious that two groups whose
respective members speak two completely different languages, follow different religions
and have different physiognomic attributes are more distant than two groups that share
similarities in their languages, follow the same religion and have a similar appearance.
This underlines the key difference between the diversity concept and the fragmentation
and polarization indices. For many economic problems, it is not the pure number of
groups that is of interest, but rather how difficult coordination or instrumentalization
between the various groups is. In more diverse countries, agreement on public goods (e.g.,
infrastructure or social security systems) is more difficult (Alesina et al., 1999), the level of
generalized trust lower (Bjørnskov, 2008) and the incidence of conflicts higher (Collier and
Hoeffler, 2002).130 The main aim of this essay is to fill this gap and to offer an index taking
these aspects into account. The global data set offers the possibility to construct an index
covering the degree of diversity between groups within countries, as well as the cultural or
ethnic (dis)similarity between countries. A measure of cultural affinity which extends the
rather crude measure of genetic distance should affect international trade flows. Assessing
this new multi-faceted index is thus the base to further expand current research on the
implications of ethnicity with a new aspect of cultural distance, i.e., its diversity.


The remainder of this chapter is structured as follows. Section 3.2 briefly summarizes
the current discussion surrounding the conceptual and measurement problems. In section
3.3, the theoretical background of the new similarity parameters is outlined. Section 3.4
introduces the data sources used. Section 3.5 discusses the operationalization of the new
distance-adjusted ethno-linguistic fractionalization index (DELF), and compares it with
existing measures. Section 3.6 outlines the resulting new diversity values for a range
of countries. In a second step, a (dis)similarity measure between countries, based on
comparable premises, is set up and discussed. Finally, section 3.7 summarizes the key
findings, concludes and gives an outlook for further research.


129 For a good, yet methodological-technical discussion of the prerequisites to measure diversity, see Bossert
et al. (2003) and Nehring and Puppe (2002). Both rely on the earlier concept developed by Weitzman 
(1992). 

130 To be precise, ethnic fragmentation or diversity per se is not the cause of the various (negative)
socio-economic outcomes. However, both settings offer more possibilities to exploit these distinctions. 


Philipp Kolo - 978-3-653-02395-4 
Downloaded from PubFactory at 01/11/2019 11:11:06AM 
via free access 




3.2 Different aspects of ethnicity and its measurement 


Alesina et al. (2003) describe ethnicity as a “rather vague and amorphous concept” (Alesina 
et al., 2003, p. 160) that makes any measurement hard to grasp.131 To better operationalize
ethnicity, this essay follows Chandra and Wilkinson (2008). According to them, ethnic 
structure comprises a set of ethnic identities that includes all phenotypical attributes (skin 
pigmentation or body figure), as well as religion, language and the traditions one was raised 
in. This is very much in line with Barrett et al. (2001), whose data is used later on in this 
chapter.132 Following these authors, ethnicity is defined in this chapter along language,
ethno-racial (ethnic origin, skin pigmentation and race) and religious aspects. 

Defining the characteristics of ethnicity in detail, which is already more diligent than
most papers in this field, is not sufficient for what this essay strives for. Within each of
the defining criteria, a (dis)similarity level between two distinct groups must be assignable.
Information on the degree of (dis)similarity is the crucial starting point in any assessment
of diversity (Bossert et al., 2003). Despite the reluctance of many authors to define the
characteristics of ethnicity, a more thorough examination of similarity differences has not
been discussed at all. Distance between groups influenced neither the decision of how to
draw the line between groups, nor the interpretation of the fractionalization found. Taking
language groups as an example, one could divide groups based on mere dialects, different
languages or even different language families. Depending on the level of similarity between
groups, different group setups would then emerge.133 In this case, the amount of common
vocabulary would define their distance.


Based on the defined number of ethnic groups, the question of its mathematical operationalization
arises.134 The most common measure for ethnicity is its fractionalization,
known as the ethno-linguistic fractionalization index (ELF). It is calculated as one minus
the Herfindahl-Hirschman concentration index:

$$\mathrm{ELF} = 1 - \sum_{i=1}^{K} p_i^2, \qquad i = 1, \dots, K \tag{3.1}$$

where $K$ is the number of groups $i$ and $p_i$ their relative group sizes. Its value moves between
zero and one and represents the probability that two randomly selected individuals from a
population come from different groups. A higher value thus indicates a more fragmented
country, i.e., a country with a higher number of distinct ethnic groups. A value close to


131 Brown and Langer (2010) offer a broad summary of the recent discussion surrounding the definitions 
of ethnicity as well as its measurement problems. 
132 They include language, ethnic origin, skin pigmentation, race, culture or religion, and nationality as 
characteristics to describe ethnicity. 
133 For a discussion on how different levels of aggregation of linguistic fragmentation affect the outcomes 
in the analysis of ethnic conflicts, see Desmet et al. (2012). 
134 Ginsburgh and Weber (2011, Ch. 6) offer a good overview of the different classes of indices used, their
historical development and recent applications. Desmet et al. (2009) compare the effect of most of these 
different indices on the level of redistribution. 
one indicates high fragmentation within countries. After the introduction of the ELF by
Taylor and Hudson (1972), based on the data of the Atlas Narodov Mira (Bruk, 1964),
several additional indices were developed. The second most prominent of these is the
measure of polarization introduced by Garcia-Montalvo and Reynal-Querol (2002).135 It
captures a completely different aspect of a country's ethnic setup, and underlines that for
each economic problem under analysis, the adequate index needs to be applied. Assessing
the deviation from an even 50/50 split of two groups, Garcia-Montalvo and Reynal-Querol
(2002) find that this index is a much better predictor of conflict incidence than the
ELF measure. It apparently better measures the ethnic constellations responsible for an
uprising. The polarization index (POL) is defined as:


$$\mathrm{POL} = 1 - \sum_{i=1}^{K} \left( \frac{0.5 - p_i}{0.5} \right)^2 p_i, \qquad i = 1, \dots, K \tag{3.2}$$


$p_i$ are again the relative group sizes of groups $i$. The POL index also tends towards
zero for very homogeneous countries, i.e., with only one group. However, with increasing
group numbers, ELF and POL show clearly different courses. Figure 3.1 shows these
differences based on equally sized groups. While ELF is an increasing function of the
number of groups, POL reaches its maximum with two equally sized groups and decreases
afterwards. This clearly underlines that the indices do in fact measure two different things
although they are based on the same data.
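
The comparison of the two indices for equally sized groups can be reproduced with a few lines of code. The following is a minimal sketch of Equations (3.1) and (3.2) only; the function names and the helper `equal_shares` are illustrative assumptions and not part of the original analysis.

```python
def elf(shares):
    """Equation (3.1): one minus the sum of squared relative group sizes."""
    return 1.0 - sum(p * p for p in shares)


def pol(shares):
    """Equation (3.2): one minus the share-weighted squared deviation from 0.5."""
    return 1.0 - sum(((0.5 - p) / 0.5) ** 2 * p for p in shares)


def equal_shares(k):
    """Relative sizes of k equally sized groups (hypothetical helper)."""
    return [1.0 / k] * k


# Print ELF and POL for one to six equally sized groups.
for k in range(1, 7):
    shares = equal_shares(k)
    print(k, round(elf(shares), 3), round(pol(shares), 3))
```

With equally sized groups, the ELF reduces to 1 - 1/K and keeps rising with K, whereas POL equals one for two groups and falls afterwards, which is the pattern described for Figure 3.1.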


Figure 3.1: ELF and POL values depending on the number of equally sized groups 


Bossert et al. (2011) introduce a more flexible version of the ELF, the generalized ethno-linguistic
fractionalization index (GELF). The technical side of the index brings two important
improvements. Firstly, it does not rely on pre-defined groups but takes the individual


135 Their approach goes back to earlier work of Esteban and Ray (1994). 
and its specific characteristics as a starting point.136 Based on the specific characteristics,
a mutual similarity matrix between individuals takes the distance between them into account.
In this way, the groups emerge ‘endogenously’ from the matrix. The similarity value
between two individuals $i$ and $j$ for all $i, j \in \{1, \dots, N\}$ is given by $s_{ij}$, with:


$$1 \geq s_{ij} \geq 0 \tag{3.3}$$
$$s_{ii} = 1 \tag{3.4}$$
$$s_{ij} = s_{ji} \tag{3.5}$$


A similarity value of one indicates perfect similarity, whereas a value of zero would indicate
two individuals that do not share any characteristics. For a society with $N$ individuals,
all $\{s_{ij}\}$ are contained in an $N \times N$ matrix, labeled similarity matrix $S_N$, which is the main
building block of the GELF. Based on this matrix, the corresponding GELF value for a
country with $N$ individuals is given by:


$$G(S_N) = 1 - \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} s_{ij} \tag{3.6}$$


GELF is then the expected dissimilarity between two randomly drawn individuals. As data
on individuals are seldom available, the transfer to group-specific data on the smallest
aggregation level is needed. The adaptations are, however, rather small. In a society
with $N$ individuals, $K$ groups exist with respective populations of $m_k$ individuals for all
$k \in \{1, \dots, K\}$. It holds that $\sum_{k=1}^{K} m_k = N$ and $p_k = m_k / N$ is the respective relative group
size. The individuals in each group are all perfectly similar, i.e., their mutual individual
similarity values would be one. By grouping all individuals together that share similarity
values of one, groups emerge ‘endogenously’. The similarity between two groups, $k$ and $l$,
is denoted as $\hat{s}_{kl}$ and is equivalent to the individual similarity value $s_{ij}$ for any $i$ in group $k$ and
$j$ in group $l$. Rearranging Equation (3.6), it follows that:


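$$
\begin{aligned}
G(S_N) &= 1 - \frac{1}{N^2} \sum_{k=1}^{K} \sum_{l=1}^{K} m_k \, m_l \, \hat{s}_{kl} \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} \frac{m_k}{N} \, \frac{m_l}{N} \, \hat{s}_{kl} \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k \, p_l \, \hat{s}_{kl} = \mathrm{DELF}
\end{aligned}
\tag{3.7}
$$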
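
The step from the individual-level GELF to the group-level DELF can be checked numerically. The sketch below is only an illustration under stated assumptions: the three-group society, its group sizes and its similarity values are hypothetical, and the function names are not taken from the original text. It builds the full individual similarity matrix implied by the groups and compares Equation (3.6) with Equation (3.7).

```python
def gelf(similarity):
    """Equation (3.6): expected dissimilarity of two randomly drawn individuals."""
    n = len(similarity)
    total = sum(similarity[i][j] for i in range(n) for j in range(n))
    return 1.0 - total / (n * n)


def delf(shares, group_similarity):
    """Equation (3.7): the same quantity computed from relative group sizes."""
    k = len(shares)
    return 1.0 - sum(
        shares[a] * shares[b] * group_similarity[a][b]
        for a in range(k)
        for b in range(k)
    )


def expand_to_individuals(sizes, group_similarity):
    """Build the N x N individual matrix implied by the group sizes (illustrative helper)."""
    members = [g for g, size in enumerate(sizes) for _ in range(size)]
    return [[group_similarity[a][b] for b in members] for a in members]


# Hypothetical society: three groups with 5, 3 and 2 members.
sizes = [5, 3, 2]
s_hat = [
    [1.00, 0.35, 0.00],
    [0.35, 1.00, 0.59],
    [0.00, 0.59, 1.00],
]
shares = [m / sum(sizes) for m in sizes]

individual_value = gelf(expand_to_individuals(sizes, s_hat))
group_value = delf(shares, s_hat)
print(individual_value, group_value)
```

Both routes give the same value (about 0.4442 for these made-up numbers), which is the point of the rearrangement: only group shares and pairwise group similarities are needed, not data on individuals.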


The relation between the DELF and the ELF index is quite obvious. The ELF is based
on groups that have a similarity value of either one, given both belong to the identical
group, or zero otherwise. Thus, the products are always zero if two different groups


136 This, however, is the main drawback of its operationalization, as reliable data on individuals are seldom 
available, especially in developing countries. 
are matched. A value of one is only assigned if the groups are matched with themselves,
leading to a value of $(p_k \cdot p_k \cdot 1) = p_k^2$ and $(p_k \cdot p_l \cdot 0) = 0$, respectively. The sum over all $K$
groups then directly leads to Equation (3.1), where the ELF is specified.137 The important
improvement in this approach is that it does not rely on pre-defined groups and thus avoids
treating groups that actually have very large distances between them as equal.138
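
The special case described above can be illustrated with a short sketch. It assumes hypothetical group shares and similarity values chosen for illustration only; the function names are likewise assumptions. With a similarity matrix that is one on the diagonal and zero elsewhere, the DELF of Equation (3.7) collapses to the ELF of Equation (3.1); any positive similarity between different groups lowers the value.

```python
def delf(shares, similarity):
    """Equation (3.7): one minus the similarity-weighted sum of share products."""
    k = len(shares)
    return 1.0 - sum(
        shares[a] * shares[b] * similarity[a][b]
        for a in range(k)
        for b in range(k)
    )


def elf(shares):
    """Equation (3.1): one minus the sum of squared relative group sizes."""
    return 1.0 - sum(p * p for p in shares)


def identity(k):
    """Similarity matrix of the ELF case: one within a group, zero across groups."""
    return [[1.0 if a == b else 0.0 for b in range(k)] for a in range(k)]


# Hypothetical shares of three groups.
shares = [0.5, 0.3, 0.2]

# Hypothetical similarities: the first two groups are closely related.
related = [
    [1.00, 0.82, 0.00],
    [0.82, 1.00, 0.00],
    [0.00, 0.00, 1.00],
]

elf_value = elf(shares)
delf_identity = delf(shares, identity(3))
delf_related = delf(shares, related)
print(elf_value, delf_identity, delf_related)
```

For these numbers the ELF and the identity-matrix DELF both equal 0.62, while the DELF with related groups drops to about 0.374.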
Finally, de Groot (2009) assessed the ethnic affinity between African nations.139 In
doing so, he also draws on the articles of Fearon (2003) and an earlier version of Bossert
et al. (2011), and is closest to the approach of this essay. De Groot (2009), however, only
offers data on ethnic affinity between countries and limits his assessment to Africa. This
essay consequently extends the work of all three studies.


3.3 Calculation of the distance values 


For the calculation of the distance values, this essay draws on Fearon (2003). His approach 
is adapted for three ethnicity characteristics: language, ethno-racial and religious identification.
Taking a broader set of characteristics and similarity measures into account offers
a more multifaceted picture.140


3.3.1 Language classification 


Language is probably the most researched and operationalized characteristic.141 As is the
case with a family tree, languages can be ordered in accordance with their mutual relatedness.
The distance between the branches gives a measure of their degree of (dis)similarity.
This is well analyzed and operationalized by the Ethnologue project (Lewis, 2009). To
uniquely identify each language, it assigns each one a three-letter code. The de-


137 Note that due to the construction of Equation (3.7), DELF values take into account mutual similarity 
values between groups that are not fully identical and will therefore always be lower than the ELF values. 
The DELF delivers the same result as a monolingual weighted index proposed by Greenberg (1956) and 
used by Fearon (2003) in his calculation of ‘cultural fractionalization’. Further attributes of the new index 
and its relation to the other indices (ELF and POL) are discussed in Garcia-Montalvo and Reynal-Querol 
(2005a, 2008) and Esteban and Ray (2011). In the latter, the index is labeled as the ‘Greenberg-Gini’ 
index. 

138 The superior theoretical explanatory power of such an index is also discussed in Ginsburgh and Weber 
(2011). 

139 The ethnic linguistic affinity (ELA) of de Groot (2009) measures, in contrast to the ELF, the amount
of characteristics shared between two countries and thus follows an inverse logic. Because it is the most 
widely propagated, this essay follows the logic of the ELF, where higher values denote more fragmented 
countries. 

140 Ginsburgh (2005) and Ginsburgh and Weber (2011, Ch. 3) offer an introduction into alternative meth- 
ods to assess the distances between groups, especially genetic and cultural distances. Genetic distance can 
be traced back to Cavalli-Sforza and Feldmann (1981). In contrast, Hofstede (2000) assesses differences 
between cultures and nations along four dimensions: power distance, individualism, masculinity and uncer- 
tainty avoidance. Comparable, but slightly different approaches, use answers from the World Value Survey 
(Desmet et al., 2011) or the voting behavior in the Eurovision Song Contest (Felbermayr and Toubal, 2010) 
to construct cultural differences between nations. 

141 Ginsburgh and Weber (2011, Ch. 3) offer a good overview of the different approaches to assess the 
distances between languages. 
cision and categorization as a separate language (instead of a dialect) not only follows
pure linguistic and lexical similarities, but also considers how far mutual understanding in
communication is possible.

This essay relies on a very closely related approach used in the World Christian Encyclopedia
(Barrett et al., 2001). A wide congruency of both sources exists, as the World
Christian Encyclopedia (henceforth WCE) is one of the sources for the Ethnologue data.
Here, a seven-character code is assigned to each distinct language. A distinct language is
defined as “the mother tongue of a distinct, uniform speech community with its own identity”
(Barrett et al., 2001, V.II, p. 245). It comprises all dialects that share at least 85%
of their vocabulary and grammar to ensure adequate communication.142 In total, 6,656
distinct languages are contained in the data analyzed. Two persons speaking one language
are treated as completely similar ($s_{ij} = 1$).143 The more characters of the assigned code
two languages share, the more similar they are. The structure is depicted in Table 3.1.


Glossocode    Description       Lexical similarity    Number    $s^L$

0             Macrozone          0%                       10    0.01
01            Glosso-zone        5%                      100    0.06
01-A          Glosso-set        30%                      594    0.35
01-AA         Glosso-chain      50%                    1,213    0.59
01-AAA        Glosso-net        70%                    2,388    0.82
01-AAAA       Glosso-cluster    80%                    4,241    0.94
01-AAAA-a     Language          85%                    6,656    1.00


Table 3.1: Language similarity classification according to Barrett et al. (2001) 


The Afghan Persian (58-AACC-b) and Southern Pathan (58-ABDA-b) groups share the
first three characters and thus belong to one Glosso-set, sharing between 30% and 50% of their
vocabulary and grammar. Consequently, both groups are assigned a similarity value $s^L$.
The assigned values are normalized on a scale between zero and one, and are matched to
demonstrate the same decreasing slope as the lexical similarity levels. Belonging to one
language group and thus sharing 85% lexical similarity corresponds to the highest $s^L$ with
$s^L = 1$.144 In the case of the example, $s^L$ takes a value of 0.35.
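
The lookup described above can be sketched in code. This is a minimal illustration, assuming that similarity is determined by the number of identical leading code characters (hyphens ignored) and that two codes without any shared leading character receive a value of zero; the latter is an assumption, as Table 3.1 does not list this case. The function names are hypothetical.

```python
# Similarity values by number of shared leading code characters (Table 3.1).
LANGUAGE_SIMILARITY = {1: 0.01, 2: 0.06, 3: 0.35, 4: 0.59, 5: 0.82, 6: 0.94, 7: 1.00}


def shared_prefix_length(code_a, code_b):
    """Count identical leading characters, ignoring the hyphens in the codes."""
    a = code_a.replace("-", "")
    b = code_b.replace("-", "")
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def language_similarity(code_a, code_b):
    """Look up the similarity value; no shared character maps to 0.0 (assumption)."""
    return LANGUAGE_SIMILARITY.get(shared_prefix_length(code_a, code_b), 0.0)


# Afghan Persian and Southern Pathan share "58-A", i.e., one Glosso-set.
print(language_similarity("58-AACC-b", "58-ABDA-b"))
```

For the two Afghan groups the function returns 0.35, matching the value given in the text.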


3.3.2 Ethno-racial distance 


Fragmentation that is derived from a biological taxonomy of species is mainly based on
genealogical relatedness between different people in modern humanity. The long evolu-


142 The same threshold is used by the Ethnologue project (Lewis, 2009), which is one of the main sources
for the assignment of language similarity levels. The second source is Dalby and Williams (1999). The 
data and classification can also be found online under: http://www.linguasphere.info. 

143 For a different way of taking language differences into account, see Desmet et al. (2012). Depending on
the similarity level defined (e.g., dialects vs. languages), different numbers of groups and thus different 
levels of fragmentation, eventually emerge. This follows on from the discussion in the introduction that 
the (arbitrary) group definition significantly impacts ELF levels. 

144 For a discussion on alternative similarity values, see Appendix C.1.2. 
tionary process is described by Ahlerup and Olsson (2007) as ‘genetic drift’. This means
that the human species developed quite differently in various parts of the world, so that one
can map a genealogical tree based on the genetic congruence of the resulting
races. Cavalli-Sforza and Feldmann (1981) created these phylographic trees by mapping
the differences in special sections of the human DNA. Cavalli-Sforza et al. (1993) assessed
dyadic distances between 42 world populations computed from 120 alleles in the human
genome.145


This was certainly a pioneering piece of work but also demonstrates some limitations.
The first one is the small number of groups (42) for the global classification. For Europe,
Spolaore and Wacziarg (2009) only refer to four different genetic groups in their analysis
of innovation and development diffusion across countries.146 It is quite obvious that this
might not be sufficient to describe the diversity of Europe. The second caveat is brought
forward by Giuliano et al. (2006), who discuss in detail the use of genetic distance data
and conclude that it is a proxy for geographical distances, rather than a proxy for cultural
distances.147 The genes used to assess the genetic distance in Cavalli-Sforza et al.
(1993) are only in a very limited way responsible for the phenotypical or anthropometric
differences. The part of the DNA used is located on neutral points that are subject only to
random drift, and less to evolutionary selection.148 However, to assess the distance between
two human beings, with respect to their ease or willingness to cooperate, phenotypical or
anthropometric markers should be relevant.149


In order to combine these views and caveats, this essay follows an ethno-racial taxonomy
outlined by Barrett et al. (2001). Each unique group is assigned a six-character
code based on differences of race, skin pigmentation and ethnic origin.150 Although those
characteristics are closely linked in their development, their role for mutual understanding
differs and is treated as cumulative in the subsequent analysis.151


145 Due to the special location of the DNA compared, differences are caused only by a constant random
drift. This allows one to calculate when two populations split up genetically during the course of the 
peopling of the world. 

146 For Europe, a more precise split of genetically different groups is available, but it is not possible to 
combine this with the global structures, because these data are based on a different set of genes. Ashraf 
and Galor (2011) use an extended version of genetic distance data covering 53 ethnic groups and their 
mutual heterozygosity based on Ramachandran et al. (2005). 

147 Ramachandran et al. (2005) confirm this hypothesis in an analysis of their extended set of 53 populations.
They show that correlation values between different measures of genetic distance and the geographical
distance from Ethiopia are at least 0.76.

148 However, evolutionary selection is strongly driven by the appearance of species (e.g., mating) or their 
better adaptability to the surroundings; that is mainly due to differences in their physical shape. 

149 Caselli and Coleman (2008), for example, attribute the emergence of the conflict in Rwanda to the
possible distinction between Hutus and Tutsis according to their body sizes. 

150 This also includes some major similarities between languages to define distinct cultural groups, which
is due to the very closely linked development of genetic and language evolution (Cavalli-Sforza et al.,
1988).

151 This approach is also followed by de Groot (2009). 




Analogous to the pure language case, the different levels of ethno-racial classification
are summarized in Table 3.2.152 The broadest classification is along racial lines, with
five different races existing. The next level adds a geographical marker (e.g., African or
European) to the race distinction. The major culture area adds an additional physiological
characteristic, mainly driven by skin pigmentation. The first three characters of the code
are thus driven by phenotypical differences. Local races are characterized as a “culture
area, local breeding population/reproductive isolate and genetically distinct population”
(Barrett et al., 2001, V.II, p. 19). To differentiate between larger ethno-racial families and
to characterize distinct ethnic groups or ‘microraces’, a final character is assigned as an
identifier. On the global scale, the data contains 393 such ethno-racial families.153


E-L-Code    Description           Level   Groups   s^E
A           Race                  1       5        0.01
AU          Geographical race     2       13       0.21
AUG         Major culture area    3       18       0.59
AUG-03      Local race            4       72       0.88
AUG-03-b    Ethno-racial family   5       393      1.00


Table 3.2: Ethno-racial group and similarity classification according to Barrett et al. (2001) 


For the ethno-racial classification, Barrett et al. (2001) do not clearly develop a similarity
measure, instead measuring the distance on integer values. The different similarity levels
(s^E) are therefore calculated so that the similarity values decline with the same slope
as for the language characteristic.¹⁵⁴

Taking the same two groups in Afghanistan and comparing their ethno-racial classification
allows one to derive their similarity value for this characteristic. Accordingly, the
Persians (CNT-24-f) and Southern Pathans (CNT-24-a) belong to one ethno-racial family
and are eventually assigned a mutual similarity value s^E of 0.88.
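The lookup behind Table 3.2 can be sketched in a few lines of Python. This is only an illustration and not the procedure used by Barrett et al. (2001): the function names, the way the code string is split into five parts, and the value 0 for two groups that do not even share a race are assumptions made here.

```python
# Similarity s^E by deepest shared classification level, taken from Table 3.2.
# Level 0 (not even the race is shared) -> 0.0 is an assumption of this sketch.
SIMILARITY_BY_LEVEL = {0: 0.0, 1: 0.01, 2: 0.21, 3: 0.59, 4: 0.88, 5: 1.00}


def split_code(code):
    """Split a code such as 'CNT-24-f' into its five hierarchical parts."""
    letters, local_race, family = code.split("-")
    if len(letters) != 3:
        raise ValueError(f"unexpected code format: {code!r}")
    return [letters[0], letters[1], letters[2], local_race, family]


def ethno_racial_similarity(code_a, code_b):
    """Return s^E for two groups from the number of leading parts they share."""
    shared = 0
    for part_a, part_b in zip(split_code(code_a), split_code(code_b)):
        if part_a != part_b:
            break
        shared += 1
    return SIMILARITY_BY_LEVEL[shared]


# Persians and Southern Pathans share the first four levels of the code.
print(ethno_racial_similarity("CNT-24-f", "CNT-24-a"))  # 0.88
```

Comparing leading parts only, and stopping at the first mismatch, reflects the nested nature of the classification: two groups cannot share a local race without also sharing the three broader levels.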


3.3.3 Religious classification 


Religion is undoubtedly a major factor in shaping cultural habits and practices. The
existence of different religions is often seen as an important reason for conflicts or general
misunderstandings between different groups.¹⁵⁵ Religious identification is, in a certain
way, an especially potent but easily implemented instrument to expand one’s political


152 Whenever it is not the unique contribution of Barrett et al. (2001), the ethno-racial classification 
closely follows the Encyclopedia Britannica. 

153 Barrett et al. (2001) caution that these racial classifications act only as an indicator, as there “exist
almost imperceptible gradations of genetic character from one group of people to the next” (Barrett et al.,
2001, V.II, p. 15). In general, this allows for mixtures between the outlined races.

154 Therefore the values of s^E clearly differ, because only five levels are assigned for the ethno-racial
classification, instead of seven, as is the case for language.

155 See, for example, Garcia-Montalvo and Reynal-Querol (2003) for the increased incidence of conflicts
and de Groot (2009) for its spillover effects between neighboring states. For a more general discussion on
the effect of religious beliefs on economic growth, see Barro and McCleary (2003).




power through mobilizing one’s followers. Religious inspiration may then be used to
trade loyal following in this life for rewards in an afterlife. The commonality of religion,
however, can also be a major driver of trust, enhancing trade between nations with the
same denomination (Guiso et al., 2009). This underlines the importance of this specific
characteristic in assessing the differences between groups.

The major problem with religion is the assessment of the differences between religions. How to treat
the differences between denominations, e.g., between Catholics and Protestants,
or between Shias and Sunnis, is quite hard to answer. One could try to pursue the
same method as for language and race to assess mutual commonalities. For religion,
one could rely on shared festivities, common holy books, common saints/prophets, traditions
or values (e.g., mercy). However, there is no known source offering a discussion
of this, let alone a structured assessment of the religions of the world. The WCE lists
14 major religions in the data: Agnostics, Buddhists, Chinese folk-religionists, Christians,
Confucianists, Daoists, Ethnoreligionists, Hindus, Jews, Muslims, New religionists, Sikhs,
Spiritists and Zoroastrians. This essay follows the approach that Bossert et al. (2011)
applied in their study. For their partition along ethnic lines, they apply a purely categorical
assessment, i.e., the mutual similarity values are either one or zero.¹⁵⁶ This approach
should be adjusted as better data become available.


3.3.4 Other socioeconomic aspects 


An interesting idea championed by Bossert et al. (2011) is that for the distance people feel
between each other, not only does their ethnicity play a role, but also their similarities
in other dimensions. Bossert et al. (2011) use educational and income similarities in
addition to ethnic diversity, arguing that these variables are relevant for a ‘felt’ distance
between individuals or groups.¹⁵⁷ Bossert et al. (2011) conclude that in states where one
finds economic homogeneity, ethnic diversity might be less important than in economically
more heterogeneous states, where both show comparable levels of ELF.

As for this essay, one faces two problems. Most socioeconomic variables are not available
at the same level of granularity as the data used here, and data might not be matched
to the ethnic groups. The more serious problem is that most economic literature finds a
significant impact of ethnicity on various socioeconomic variables. Additionally, in many
countries, the wealth or education stratification is closely linked to ethnic descent. Thus,
with a high certainty there exists endogeneity of these socioeconomic variables with regard
to ethnicity.¹⁵⁸ As this cannot be ruled out, and there is no adequate data to match the
level of detail for ethnicity employed hereafter, further analysis into this aspect is not
pursued.


156 Guiso et al. (2009) use the same approach but with a slightly smaller number of denominations.

157 In this regard, Bjørnskov (2008) points toward social trust and income inequalities. Another interesting
approach for the US is that of Lind (2007). He tries to assess the inter-group distance through measuring
differences in stated preferences on policy questions.

158 The same might be true for religion and languages, or even dialects.
3.4 Data description and comparison with other sources


There are various sources for religious, ethnic and language data that are widely used 
in the literature. Besides the wide range of ethno-linguistic groups in the Atlas Narodov 
Mira (Bruk, 1964), Alesina et al. (2003) mainly use data from the Encyclopedia Britannica 
(Encyclopedia Britannica, 2007) and from the CIA World Fact Book (CIA, 2011) for their 
data on ethnicity. For languages, the Ethnologue project (Lewis, 2009) offers very detailed 
data of nearly 7,000 languages. Finally, L'Etat des Religions dans le Monde (Clévenot, 
1987) offers very exhaustive data on religious affiliation for a wide range of countries.¹⁵⁹
All these sources have their advantages and are certainly applicable for the intention of 
the respective authors. They, however, lack an important aspect, which is relevant for the 
analysis here. To build the similarity matrix based on all three traits (language, ethno- 
racial, religion), each group needs to be defined in accordance with all three of them. This 
is not possible with the above sources as the groups found in the sources vary depending 
on the defining criteria. 

The source offering the required data is the World Christian Encyclopaedia (Barrett
et al., 2001).¹⁶⁰ It contains data for over 12,000 groups in 210 countries, classified according
to language, ethno-racial group and religion.¹⁶¹ The data are based on various sources
including official reports, national censuses, statistical questionnaires, field surveys and
interviews, as well as several other published and unpublished sources. The level of
detail and the vast coverage of countries are a strong advantage of this source. The data on
languages and ethno-racial affiliation are widely used.¹⁶² Due to the Christian background
of the publishing institutions, one could argue (at least for the data on religion) that the
numbers might be biased. Their very detailed assessment of Christian denominations,
however, is an indication of a real interest in surveying Christianity and drawing an unbiased
picture of their faith.¹⁶³ The high granularity of data might still raise some questions


159 Akdede (2010) gives a good overview of the data sources used in a broad set of influential articles and 
discusses their differences. 

160 For all calculations the online version, The World Christian Database (Johnson, 2010), is used. It 
reflects the data in the printed version of Barrett et al. (2001) but includes significant updates and refers 
to the 2005 — 2010 time period. 

161 In total, over 13,500 groups for 239 countries are included in the data. Groups that differ only through
dialects or, in some cases, geographical specifics, like, for example, the Bedouin tribes in Algeria, were 
excluded. Additionally, very small islands and constituencies with an unclear legal status (e.g., Western 
Sahara) were excluded. 

162 See, for example, Annett (2001), Barro (1999), Barro and McCleary (2003), Collier and Hoeffler (2002,
2004), Collier et al. (2004), Garcia-Montalvo and Reynal-Querol (2005a, 2008, 2010), Loh and Harmon 
(2005), or Okediji (2005). 

163 Additionally, Barrett et al. (2001) explicitly mention the United Nations' Universal Declaration of 
Human Rights in their preface, which grants the freedom to choose one's religion, including not having a 
religion at all. De Groot (2009) uses a similar, unorthodox evangelical source, the Joshua Project (2007). 
He also concludes that the “religious fervency with which this organization collects data works in our 
advantage" (de Groot, 2009, p. 14). Collier and Hoeffler (2002, 2004) and Collier et al. (2004) used it for 
their index on religious fractionalization. However, Garcia-Montalvo and Reynal-Querol (2005a) discuss 
some bias towards Christianity at the expense of Animist cults in Latin American countries. Although 
there is no evidence of a general bias in religious affiliation, it can't be ruled out completely. 
about its accuracy. To test the robustness of the base data, two additional data sets with
some noise based on a normal randomization are created. Additionally, the consistency of
the data was tested by excluding very small groups. For both robustness
checks, no significant deviation from the results employing the base data set occurs.¹⁶⁴
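The two robustness checks can be illustrated with a small Python sketch. The actual procedure is documented in Appendix C.1.1 and is not reproduced here, so everything specific below is an assumption: the noise level, the clipping at zero, the renormalization of the shares, the size threshold and the example shares themselves are hypothetical.

```python
import random


def elf(shares):
    """Fractionalization: one minus the sum of squared population shares."""
    return 1.0 - sum(p * p for p in shares)


def perturb_shares(shares, sigma, rng):
    """Add normal noise to each share, clip at zero and renormalize to one."""
    noisy = [max(0.0, p + rng.gauss(0.0, sigma)) for p in shares]
    total = sum(noisy)
    if total == 0.0:
        return list(shares)  # degenerate draw: fall back to the base data
    return [p / total for p in noisy]


def drop_small_groups(shares, threshold):
    """Exclude groups below the threshold and renormalize the remaining ones."""
    kept = [p for p in shares if p >= threshold]
    total = sum(kept)
    return [p / total for p in kept]


rng = random.Random(42)  # fixed seed so that the sketch is reproducible
base = [0.60, 0.25, 0.10, 0.04, 0.01]  # hypothetical group shares of one country

print(round(elf(base), 4))                             # base data
print(round(elf(perturb_shares(base, 0.01, rng)), 4))  # base data plus noise
print(round(elf(drop_small_groups(base, 0.02)), 4))    # without very small groups
```

If the index values of the perturbed and the trimmed data stay close to those of the base data across countries, the results do not hinge on the exact group sizes reported by the source.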


Below, the most granular group data is used to offer the best possibility of endogenous 
group formation. Although data at the individual level is not available, this very granular 
data is very close to the desired approach outlined earlier. Table 3.3 gives an overview of 
the data, which is structured according to Alesina et al. (2003) and Fearon (2003). 


The WCE data clearly show many more groups. Alesina et al. (2003) have, on average,
fewer than six groups per country, whereas 59 groups are counted in the present data set, on
average. Besides the higher number of groups in general, the pattern of fractionalization
across the regions is quite similar, with one exception. In contrast to the previous sources,
these data show that most groups are located in Asia.¹⁶⁵ This is nearly exclusively driven
by three countries that contribute half of all groups in this region: Papua New Guinea
with 884 groups, Indonesia with 762 and India with 428.¹⁶⁶ Excluding these three countries,
Sub-Saharan Africa is again the region with the most fragmented countries.¹⁶⁷ This becomes
even clearer when one compares the other figures in Table 3.3. The average population
share of the largest group is only 39% of the population’s total in Sub-Saharan Africa,
whereas it is at least 50% in all other regions. Also, the number of countries that have a
majority group of 50% is significantly lower there.


ELF source   Obs.   Mean    Std. Dev.   Min.    Max.
ANM          169    0.458   0.273       0.000   0.984
Alesina      186    0.440   0.257       0.000   0.930
Annett       144    0.479   0.275       0.010   0.950
Fearon       153    0.471   0.270       0.002   0.953
WCE          210    0.563   0.270       0.019   0.982


Table 3.4: Main statistical characteristics of ELF values for different sources 


The larger number of small groups also has an effect on the ELF values based on the
WCE data, reflected in a noticeably higher mean value. A higher number of groups will
increase the ELF index by design.¹⁶⁸ Table 3.4 confirms this by showing the summary
statistics of the ELF values for the various sources described earlier.
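A short worked case makes this design effect concrete. It assumes the standard fractionalization formula and K equally sized groups; the group counts 6 and 59 are borrowed from the averages quoted above purely for illustration, since real countries do not consist of equally sized groups.

```latex
\mathrm{ELF} = 1 - \sum_{k=1}^{K} p_k^2
\quad\Rightarrow\quad
\mathrm{ELF}\big|_{p_k = 1/K} = 1 - K \cdot \frac{1}{K^2} = 1 - \frac{1}{K},
\qquad
1 - \tfrac{1}{6} \approx 0.833, \quad 1 - \tfrac{1}{59} \approx 0.983 .
```

Under this simplification the index approaches one as K grows, regardless of how similar the groups are to each other.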


164 For more details on these robustness checks, see Appendix C.1.1. 

165 The Asian region includes the Pacific countries and islands.

166 Although this number seems to be high, it is very much in line with other very detailed sources. Lewis 
(2009) lists 860 languages for Papua New Guinea, over 10% of the world's total in his data set. 

167 Excluding these three countries, the average number of groups per country in Asia would only amount 
to 56. 

168 The theoretical attributes of the ELF and POL are nicely met by the WCE data. Figure C.6 of 
Appendix C.1.3 shows the increasing ELF values in conjunction with a rising number of groups within a 
country. 
3.5 DELF operationalization


For the construction of the new composite distance adjusted ethno-linguistic fractional- 
ization index (DELF), two major, partly interconnected, questions arise. The first is, 
whether the single components are redundant when compared to each other. The second 
is the assignment of weights and the way of combining the single characteristics. 

Based on theoretical considerations, no single characteristic out of the three is deemed
to be superior or more sound than the others, with all of them seeming to be of equal
relevance.¹⁶⁹ For the same reason, Okediji (2005) proposes including ethnic differentiation
alongside racial and religious characteristics.¹⁷⁰ Finally, one can argue that the distance
between the groups increases, if more differences are in place, which would be in line with
the cumulative statement of de Groot (2009).¹⁷¹

The most common approach when incorporating different characteristics into a combined
index is to assign equal weights to all of its components.¹⁷² Following this approach,
the DELF is calculated according to Equation (3.7) as:

$$ \mathrm{DELF} = 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k \, p_l \, \hat{s}_{kl} \qquad (3.8) $$

where the combined ŝ_kl is the equally weighted average of the similarity values of each
ethnicity characteristic:

$$ \hat{s}_{kl} = \frac{1}{3} \left[ s^{L}_{kl} + s^{E}_{kl} + s^{R}_{kl} \right] \qquad (3.9) $$

where s^L_kl, s^E_kl and s^R_kl are the respective similarity values for the language, ethno-racial
and religious classification.¹⁷³ The single characteristic DELFs are calculated in the same way


169 See, for example, Chandra and Wilkinson (2008) and Barrett et al. (2001). Hofstede (2000) concludes
similarly that “the world population has diversified in three ways: in genes, in languages, and in cultures” 
(Hofstede, 2000, p. 3) 

170 Okediji (2005) constructs his social diversity index based on the complementary nature of the three
characteristics and also uses WCE data. However, he does not take into account the mutual (dis)similarities 
between the groups. 

171 One could argue that by design, the language and ethno-racial classification is not without overlaps. 
This is why one should weight their sum less. On the other hand, the religious classification is less accurate 
and would, in contrast, argue for a lower weighting of this characteristic. If there is no strong reason for 
deviating from the equal weighting, Haq (2006) argues strongly for this principle. 

172 The most well-known index calculated utilizing this approach is the UNDP’s Human Development
Index (HDI). More recent examples are the SIGI index on gender equality (Branisa et al., 2009) or the 3P 
index on trafficking policies (Cho et al., 2011). For an analysis of different operationalization strategies for 
a broad set of composite development indicators, see Booysen (2002). 

173 The main focus of this essay is to assess the diversity of a country, which is well reflected by the DELF.
However, from the discussion above, one can easily apply the similarity values ŝ_kl to an adapted version
of the polarization index found in Equation (3.2). This would then transform to a distance adjusted POL
index with: D-POL — m a p? pi: Sy; (Esteban and Ray, 1994). For further theoretical discussions 
on this kind of index, see Esteban and Ray (2008) and Esteban and Ray (2011). For rare examples of 
an empirical application of this index, see Desmet et al. (2009), Esteban et al. (2010), Esteban and Ray 
(2011) and Esteban and Mayoral (2011). The data for the D-POL index based on the WCE data can be 
obtained from the author upon request. 
using Equation (3.9). Instead of the composite similarity measure (ŝ_kl), the characteristic-specific
similarity values (s^L_kl, s^E_kl, s^R_kl) are used. To decide on the redundancy of the
composite index and its components, McGillivray and White (1993) propose two thresholds
of correlation values between the components: 0.90 and 0.70.¹⁷⁴ The Spearman’s rank
correlations of the DELF values based on the components (labeled with a respective
subscript for (L)anguage, (E)thno-culture and (R)eligion) and the composite DELF index
are shown in Table 3.5.¹⁷⁵


          DELF    DELF_L   DELF_E   DELF_R
DELF      1
DELF_L    0.904   1
DELF_E    0.714   0.537    1
DELF_R    0.665   0.529    0.195    1


Table 3.5: Rank correlation for the composite DELF and its components 
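The redundancy check behind Table 3.5 can be sketched as follows. The sketch computes Spearman’s rank correlation by hand (as the Pearson correlation of the ranks) and compares it with the two thresholds of 0.90 and 0.70. The input values are hypothetical and not WCE data, and the wording of the three labels is a descriptive choice made here rather than terminology from McGillivray and White (1993).

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            result[order[position]] = mean_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def redundancy_flag(rho, thresholds=(0.90, 0.70)):
    """Report where a correlation lies relative to the two thresholds."""
    strict, lenient = thresholds
    if abs(rho) >= strict:
        return "above both thresholds"
    if abs(rho) >= lenient:
        return "between the thresholds"
    return "below both thresholds"


# Hypothetical component values for five countries (not WCE data).
delf_l = [0.10, 0.35, 0.50, 0.20, 0.80]
delf_r = [0.30, 0.05, 0.40, 0.10, 0.20]
rho = spearman(delf_l, delf_r)
print(round(rho, 3), redundancy_flag(rho))
```

Applied to the values in Table 3.5, `redundancy_flag(0.537)` and `redundancy_flag(0.195)` both report a correlation below both thresholds, which is the pattern found between the single components.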


The correlations between the single components are no higher than 0.54, falling clearly
below both thresholds. Thus, any form of double counting by using collinear indicators can
be neglected. As the composite index is partly matched to its components, the resulting
correlations are naturally higher. By correlating the components with reduced forms of
the DELF (by excluding the respective component), most correlations again fall below
both thresholds (McGillivray and White, 1993; Ogwang and Abdou, 2003).¹⁷⁶ In addition
to the overall correlations, Noorbakhsh (1998) proposes to split the total observations into
different groups. A high correlation overall might hide differences within groups, e.g., split
into quintiles. Table 3.6 shows the correlations seen in Table 3.5, split between equally
sized quintiles.


                     Quintiles
          All obs.   1        2        3        4        5
          DELF       DELF     DELF     DELF     DELF     DELF
DELF_L    0.904*     0.282    0.483*   0.401*   0.556*   0.814*
DELF_E    0.714*     0.056    0.156    0.050    0.141    0.815*
DELF_R    0.665*     0.569*   0.142    0.004    0.276    0.972*

* indicates rank correlations that are significant at the 5% level


Table 3.6: Rank correlation for equally sized quintiles (according to their DELF values) 


Indeed this shows that the higher correlations between the components and the composite
DELF vanish completely, or are at least far below both thresholds, except for the


174 Cahill (2005), McGillivray and Noorbakhsh (2004), Branisa et al. (2009) and Cho et al. (2011) subse- 
quently used this decision rule. 

175 Because all conditions are fulfilled, Pearson’s correlation coefficients can also be used. The results are 
comparable throughout, but slightly lower. As, in the following, the focus is mainly on ranking comparison, 
Spearman’s rank correlations are consequently used. 

176 The correlation between DELF_L and the reduced DELF excluding DELF_L shows a value of
0.69. The respective values for excluding DELF_E and DELF_R are 0.48 and 0.43, all falling below both
thresholds.
fifth quintile. In light of the above discussion, it is reasonable to assume that all components
are individually relevant, they indeed measure different characteristics, and the
combination of all three is a valid way to cover the complexities of ethnic diversity.

To come up with the composite DELF, an equal weighting scheme has been applied
so far. Following an extensive critique on the rather simplistic equal weighting of composite
indices (Cahill, 2005; McGillivray and White, 1993), the call for a more elaborate
weighting scheme, or at least a better foundation, is understandable.¹⁷⁷ One approach
widely discussed is principal component analysis (PCA).¹⁷⁸ Principal components are
calculated as linear combinations of the original variables (the single characteristic DELF
values in this case) so as to explain the largest part of their variation. The first principal
component explains most of the variance, followed by the second and third principal
component. In doing so, principal component analysis transforms correlated variables
into uncorrelated ones and all principal components are orthogonal. The assigned loading
factors can then be used to weight the sub-indices.¹⁷⁹

The very high correlation of 0.999 between the DELF and the index based on PCA
calculations (DELF_PCA) is seen in the upper part of Table 3.7. This suggests that one
can refrain from using the more complex weighting schemes and it underlines that none of
the components dominates the other components in a problematic way.¹⁸⁰


                  DELF    DELF_PCA   DELF_Geo   DELF_Pc   ANM     Alesina   Annett
DELF  DELF        1
      DELF_PCA    0.999   1
      DELF_Geo    0.963   0.963      1
      DELF_Pc     0.994   0.994      0.959      1
ELF   ANM         0.698   0.697      0.707      0.736     1
      Alesina     0.628   0.630      0.632      0.662     0.800   1
      Annett      0.630   0.630      0.651      0.671     0.874   0.883     1
      Fearon      0.607   0.606      0.626      0.621     0.748   0.817     0.795


Table 3.7: Rank correlation matrix for differently weighted DELF values and the most common 
ELF indices 


Having discussed the possible redundancy of the components and ways to assign their
weights, there are two ways to aggregate the components: using the arithmetic, or the


177 Chowdhury and Squire (2006) show that the vast majority of scholars still opt for the equally weighted 
average regarding aggregated development indices, despite ongoing discussions. For the HDI, Nguefack- 
Tsague et al. (2011) also provide a statistical reinforcement of the equal weighting scheme. An additional 
problem often raised is the implicit weighting due to different scales of the sub-indices (McGillivray and 
Noorbakhsh, 2004; Noorbakhsh, 1998). Through construction of the sub-indices, this problem does not 
apply to the DELF. 

178 For a discussion and its application, mainly to the HDI, see Jolliffe (1973), Ram (1982), Ogwang
(1994), Noorbakhsh (1998) or Ogwang and Abdou (2003).

179 For the results of the PCA and further details, see Appendix C.2.

180 Additionally, the variances of the sub-indices are rather similar. So, none of the sub-indices would
significantly bias the equally weighted index. For details on key statistical attributes of the single
sub-indices, see Table 3.8.
geometric mean.¹⁸¹ Using a geometric mean does ‘penalize’ high dissimilarity in one of
the components, however. This is often used in composite indices on various inequality
measures, e.g., poverty, where the direct compensation of one component through another
is not desired.¹⁸² Two individuals from the same ethno-racial and language backgrounds,
who adhere to different religions, would be completely different in the case of a geometric
mean because the religious component would be zero.¹⁸³ That a certain similarity still
prevails between both individuals/groups is obvious. Thus, for the application here, a
form of compensation between components seems reasonable. In connection with the
discussion above, the interpretation of the cumulative nature of the characteristics is more
perspicuous and, additionally, argues in favor of an arithmetic mean. Due to these very
different attributes, it is not surprising that the DELF_Geo has a lower, yet still very high
correlation to all the other DELF values.
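The example of two groups that differ only in religion can be written out directly. The following minimal sketch compares the two aggregation rules for the similarity values (1, 1, 0); the function names are chosen here for illustration.

```python
import math


def arithmetic_mean(values):
    """Equally weighted average: a low component is compensated by the others."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product: zero as soon as one component is zero."""
    return math.prod(values) ** (1.0 / len(values))


# Same language and ethno-racial group (similarity 1), different religion (0).
similarities = [1.0, 1.0, 0.0]

print(arithmetic_mean(similarities))  # about 0.667
print(geometric_mean(similarities))   # 0.0
```

With the geometric mean the two groups appear completely dissimilar although they share two of three characteristics, which is the non-compensatory behavior the text argues against for the DELF.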


As an alternative, the introduction of a certain non-linearity of compensation between
characteristics might be reasonable. This is, for example, promoted by Branisa et al.
(2009). To allow for a certain compensation, one squares the components before the
calculation of the arithmetic mean. This leads to an adjusted value of DELF_Pc. In line
with Nardo et al. (2005), in this approach the weights are interpreted as trade-offs and
not as importance coefficients.¹⁸⁴


Finally, the DELF index should contain different information than other indices that
try to measure ethnic fragmentation or diversity. Thus, the redundancy considerations
regarding the components can be applied as a comparison to existing ELF indices. The
results are found in the lower part of Table 3.7. All rank correlations between the most
common ELF indices and the new DELF fall below both redundancy thresholds.¹⁸⁵ Although
this was already alluded to in the theoretical discussion, where it was apparent that both
indices measure different things (fragmentation versus diversity), the statistical results
provide additional confirmation.


181 An additional aggregation for the DELF_PCA index is not necessary because, by construction, the
distance vector of the first principal components contains the weights and aggregation implicitly. 

182 The HDI just recently switched from an arithmetic mean to a geometric one. To advance a country’s
development it now needs to advance much more equally across the sub-indices than before, where one 
could compensate for one index with another. A geometric mean for an index would also imply a clear 
assignment of both a bad and good state for the values of zero and one. This is possible for poverty 
and development indices but not for the DELF, which describes a state between two extremes without 
valuation. 

183 Collier and Hoeffler (2002), Collier and Hoeffler (2004) and Collier et al. (2004) use a multiplicative 
combination of the ethnic and religious fractionalization measure to assess ‘social fractionalization’. To 
avoid the dominance of one characteristic, where two groups are completely different, they add the index 
which is the greater to the product of both indices. 

184 Thus, an individual can reduce the distance to another individual who does not adhere to the
same religion by learning his language. For further theoretical discussions on weighting and differences 
between compensatory and non-compensatory approaches, see Munda and Nardo (2005). Branisa et al. 
(2009) offer a functional operationalization. 


185 Note that the number of observations varies across the correlation values with the ELF indices due to
their more limited observations. 
The arithmetical average between the single characteristics is therefore the easiest way
to operationalize the composite DELF index. Furthermore, it has compensatory
attributes between the characteristics that reflect their complementarity. This is not
given by using the geometric mean, for example. By using the part compensation method
and principal components, comparably adequate results are found to those of the simple
arithmetic mean. As their correlation is rather high, the method used here follows the
principle of keeping it as simple as possible.¹⁸⁶


3.6 Results 


For each country, a similarity matrix is calculated, containing all ŝ_kl for the weighting of
mutual group similarities. Tables C.2 and C.3 of Appendix C.2 detail the general similarity
matrix calculation. The group similarity calculations are comparable to the ones within a
country and for the difference between countries.
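The calculation for a single country, following Equations (3.8) and (3.9), can be sketched as below. The three groups, their population shares and all similarity values are hypothetical and chosen only to show the mechanics; the function names are likewise assumptions of this sketch and not part of the WCE data or the original computation.

```python
def combined_similarity(s_lang, s_ethno, s_relig):
    """Equation (3.9): equally weighted average of the three similarity values."""
    return (s_lang + s_ethno + s_relig) / 3.0


def delf(shares, similarity):
    """Equation (3.8): one minus the similarity-weighted sum over all group pairs.

    shares      -- population shares p_k of the K groups (summing to one)
    similarity  -- K x K matrix with entries in [0, 1] and ones on the diagonal
    """
    k = len(shares)
    total = 0.0
    for i in range(k):
        for j in range(k):
            total += shares[i] * shares[j] * similarity[i][j]
    return 1.0 - total


# Hypothetical country with three groups (illustrative numbers, not WCE data).
shares = [0.5, 0.3, 0.2]
s_lang = [[1.0, 0.6, 0.1], [0.6, 1.0, 0.1], [0.1, 0.1, 1.0]]
s_ethno = [[1.0, 0.88, 0.21], [0.88, 1.0, 0.21], [0.21, 0.21, 1.0]]
s_relig = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

s_hat = [
    [combined_similarity(s_lang[i][j], s_ethno[i][j], s_relig[i][j]) for j in range(3)]
    for i in range(3)
]

# With zero similarity between distinct groups the formula collapses to the ELF.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
print(round(delf(shares, identity), 3))  # 0.62
print(round(delf(shares, s_hat), 3))     # composite DELF, below the ELF value
```

Because Equation (3.8) is linear in the similarity values, the composite DELF equals the simple average of the three single characteristic DELF values, which is consistent with the means reported in Table 3.8.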


3.6.1 Diversity measure within countries 


The size of the respective K x K matrices for each country is defined by the number of
groups found in it, ranging from 3 to 884.
Figure 3.2: Combined and single characteristic DELF values against ELF values.


186 For further details on all weighting schemes, see Appendix C.2. A detailed discussion of the superiority
of the equal weighting scheme is found in McGillivray and Noorbakhsh (2004), who conclude that more
elaborate weighting schemes “produce values which are generally indistinguishable from values of the
equally weights index” (McGillivray and Noorbakhsh, 2004, p. 15). Comparably, de Groot (2009) uses the
same approach in his ethno-linguistic affinity index.
To make the differences between the ELF and DELF values clear, Figure 3.2 shows
the influence of the various characteristics.¹⁸⁷ Adjusting for the language differences
only reduces the values by less than when all three characteristics are considered. The
most influential changes emerge if religion is taken into account, since in many countries a
majority religion is present, which acts as a unifying characteristic. The combined DELF,
weighting all three characteristics, yields more consistent values, which is confirmed in
Table 3.8. The standard deviation of the composite DELF is considerably lower than
those of the decomposed indices.

Religious and language homogeneity, in particular, are spread differently across regions.
This is why the adjustments also vary significantly between regions. In Latin America,¹⁸⁸
Spanish is the dominant language, although there are different ethno-racial and/or religious
groups. The language similarities add to a higher affinity between the groups and, in turn,
lower the DELF values. Table 3.9 summarizes the mean values for different ELF and
DELF specifications across regions. Additionally, it compares the average ranks of the
countries in the respective groups. A rank of one is assigned to the most heterogeneous
countries, i.e., the countries with the highest ELF or DELF values. Comparing both
ranks gives a good indication of how large the adjustments in the DELF calculation are
compared to the standard ELF values.


Index     Observations   Mean    Std. Dev.   Min.    Max.
ELF       210            0.563   0.270       0.019   0.982
DELF      210            0.252   0.157       0.006   0.636
DELF_L    210            0.353   0.243       0.008   0.942
DELF_E    210            0.255   0.176       0.002   0.708
DELF_R    210            0.148   0.188       0.000   0.648


Table 3.8: Main statistical characteristics of DELF values, decomposed for all ethnicity charac- 
teristics 


Sub-Saharan Africa (SSA) demonstrates a much higher value when measured by the ELF 
compared to the DELF, resulting in a negative rank delta. As seen earlier, this re- 
gion includes countries with the highest number of groups, mirrored by high ELF values. 
However, if one takes the similarity between the groups into account, the ranks decrease. 
Eastern Europe, in contrast, shows much more diversity when considering the DELF value 
rather than the ELF value. 

More interesting is the decomposition of the DELF into its single characteristics. For
the language characteristic, Latin America hosts the most homogeneous countries, whereas
Sub-Saharan Africa again shows the most heterogeneous ones. Taking into account only
the ethno-racial aspect, Latin America shows the highest diversity. This might come from
the interbreeding of the native Indian population with the high number of descendants
from the Western colonial powers and the resulting Mestizo progeny. The region with


187 Both indices are based on WCE data.

188 Includes the Caribbean.
the most homogeneous countries in this regard is Eastern Europe, a region where outside
powers have interfered less. The religious characteristic again demonstrates the expected
distribution. Sub-Saharan Africa has the most religiously heterogeneous countries and
Western and Latin American countries, with high numbers of Christians, host the most
homogeneous ones. The Middle East and Northern African (MENA)
countries also show values indicating rather homogeneous religious characteristics, which
is not surprising considering the high proportion of Muslims in these areas. Most countries
that have a majority religion, i.e., where more than 60% of the population adhere either to
Christianity (133 countries) or to Islam (43 countries), exhibit rather low religious DELF
values. All other countries, where there is either no majority religion or it is made
up of another denomination, show significantly higher religious DELF values. Also, their
average overall DELF rank is substantially higher than when only taking the number of
groups in the ELF value into account.


                       Mean values
            Obs.   ELF     DELF    DELF_L   DELF_E   DELF_R   Rank ELF   Rank DELF   Delta Rank

Asia         40    0.608   0.290   0.435    0.240    0.194     93.3       90.8         2.5
E. Europe    29    0.389   0.197   0.261    0.20     0.126    145.9      125.0        20.8
L. America   38    0.500   0.227   0.220    0.386    0.075    121.3      114.5         6.8
MENA         21    0.558   0.249   0.358    0.275    0.114    108.1      107.0         1.2
SSA          49    0.741   0.319   0.490    0.219    0.248     62.6       81.2       -18.6
W. Count.    33    0.465   0.184   0.279    0.206    0.066    128.7      130.9        -2.3
World       210    0.563   0.252   0.353    0.255    0.148      -          -           -
Muslim       43    0.571   0.262   0.389    0.271    0.127    105.6      100.7         4.9
Christian   133    0.519   0.208   0.299    0.251    0.076    115.7      121.2        -5.7
Other        34    0.729   0.407   0.519    0.249    0.454     65.6       50.1        15.5


Table 3.9: Mean ELF and DELF values and ranks for all regions and countries with main 
majority religions 


The single country perspective shows even more considerable adjustments. The ELF and 
DELF values of each country are listed in Table C.7 of Appendix C.3. The countries are 
ordered according to their ELF values in descending order, from the most heterogeneous 
country to the most homogeneous country. The third column depicts their corresponding 
DELF values and DELF ranks. The difference between the ELF and DELF ranks is 
shown in column four. The next column outlines the DELF values, decomposed for each 
characteristic, which helps to better illustrate the adjustments.189 Half of the 10 most
diverse countries see an adjustment of more than 40 places. Looking at the lower end, one sees
only marginal adjustments, as expected. The 15 most homogeneous countries are, with 
three exceptions, the same for both indices. For the other countries, however, significant 


adjustments are found. For example, Zambia, the Republic of Congo, Zimbabwe, Angola 


and Italy, which are treated as much more homogeneous by the DELF compared to the
ELF, show differences in ranking of more than 100 places. Nevertheless, one also finds
adjustments in the opposite direction, i.e., countries that have a higher diversity rank based
on DELF values. The countries with the most significant adjustments in this regard, all
of more than sixty places, are Kazakhstan, Bahrain, Macedonia, Lebanon, Sudan and the
Russian Federation. These upward changes are mainly driven by relatively high language
diversity.

189 From Figure C.7 of Appendix C.1.3, one can see that the adjustments will tend to be more significant
for higher values of ELF than for lower ones, where both indices are much closer. This is clearly visible
for the higher ELF values at the top of Table C.7.


3.6.2 Similarity measure between countries 


To date, most authors have focused on the assessment of ethnicity within a country, as 
has this essay. This has also been the case in analyzing a country's growth or conflict 
incidence. De Groot (2009) expands upon this and proposes his index of ethno-linguistic 
affinity (ELA) to measure the similarities between two neighboring countries. He shows 
that conflict spillovers are more likely between contiguous countries sharing stronger ethnic 
similarities. The extended calculation for the DELF between countries is nearly identical 
to Equation 3.77, and is defined through: 
\[
DELF_{ij} = 1 - \sum_{k=1}^{K} \sum_{m=1}^{M} p_{ik} \, p_{jm} \, \hat{s}_{km} \tag{3.10}
\]


where country i hosts groups k = 1,..., K, and country j hosts groups m = 1,..., M, respectively.
The similarity between the two groups k and m is given by \hat{s}_{km}. The result is the
expected dissimilarity between two individuals, one randomly drawn from each country.
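The calculation between two countries can be sketched in a few lines. The function name, the group shares, and the similarity values below are hypothetical and serve only to illustrate the double sum; the sketch assumes that the pairwise term is a similarity value between zero and one, so that one minus the expected similarity gives the expected dissimilarity.

```python
def delf_between(shares_i, shares_j, similarity):
    """Expected dissimilarity of two individuals, one drawn from each country.

    shares_i[k] and shares_j[m] are relative group sizes (each list sums to 1).
    similarity[k][m] is the similarity in [0, 1] between group k of country i
    and group m of country j.
    """
    expected_similarity = sum(
        p_k * p_m * similarity[k][m]
        for k, p_k in enumerate(shares_i)
        for m, p_m in enumerate(shares_j)
    )
    return 1.0 - expected_similarity


# Hypothetical setup: country i hosts two groups, country j hosts one group.
shares_i = [0.7, 0.3]
shares_j = [1.0]
# The first group of i is identical to j's group, the second is half as similar.
similarity = [[1.0], [0.5]]
value = delf_between(shares_i, shares_j, similarity)

# Two countries whose groups share no characteristic at all.
disjoint = delf_between([0.5, 0.5], [1.0], [[0.0], [0.0]])
```

With these invented numbers the expected similarity is 0.7 + 0.15 = 0.85, so the value between the two countries is 0.15, while the two countries without any shared characteristic reach the maximum of one.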

The 210 countries analyzed here give a matrix containing over 150 million similarity
values and nearly 44,000 dyadic relations between countries.190 Due to the number of
country pairs, only a discussion of averages and some pairs with the highest discrepancy
is offered here.191 Naturally, all DELF values are much higher than those for individual
countries. Table C.8 of Appendix C.3 lists the mutually most similar and dissimilar
countries at the single country level.192 Many of the mutually most similar countries come
from the MENA region. The religious homogeneity of this region plays an important role in
their overall similarity level. It is not surprising that the most dissimilar pairs are matches
between Asian and African countries. Except for some minority migrant groups, one does
not find many shared ethnic characteristics between these countries and all their values
are close to one.

190 This significantly exceeds the 2,809 dyadic relations offered by de Groot (2009) for the 53 African
countries.

191 The complete data set can be obtained upon request.

192 In general, the interpretation of the DELF value between countries ranging between zero and one is
comparable to the case of DELF values within countries. Two countries that consist of groups that share
not a single characteristic show a mutual DELF value of one, being completely different. Lower values of
DELF correctly indicate countries that share more characteristics and thus are more 'similar'. However,
the theoretical country setup maximizing the similarity between two countries (minimizing the DELF
value) deviates in its limit from the generally understood meaning of the word 'similar'. This is discussed
in more detail in Appendix C.2.6. I would like to thank Walter Zucchini for this important comment.
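The pair counts quoted here follow from simple counting. The sketch below assumes that 'nearly 44,000' refers to ordered country pairs without self-pairs, and that the 2,809 relations for 53 countries count all ordered pairs including self-pairs. Both readings are inferred from the numbers and are not stated explicitly in the text.

```python
def ordered_pairs(n_countries, include_self=False):
    """Number of ordered country pairs (i, j), optionally counting i == j."""
    if include_self:
        return n_countries * n_countries
    return n_countries * (n_countries - 1)


pairs_global = ordered_pairs(210)                      # 210 * 209
pairs_africa = ordered_pairs(53, include_self=True)    # 53 * 53
```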

A regional aggregation also offers some interesting insights. For the calculation of
the regional averages, the DELF values between countries are adjusted for the different
population sizes of the respective country pairs.193 Table 3.10 summarizes the regional
and global averages.


Region                        Regional DELF   Country pairs

Asia                          0.719            1,600
Eastern Europe                0.479              841
Latin America                 0.340            1,444
North Africa & Middle East    0.430              441
Sub-Saharan Africa            0.643            2,401
Western Countries             0.572            1,089
World                         0.841           44,100


Table 3.10: DELF average by main geographical regions 


The global cultural diversity measured by the DELF displays an average of 0.84. Asia
exhibits the highest diversity level of all regions. Thus, from a regional
perspective, Asia seems to be the most diverse region, and not SSA.194 Latin America, in
contrast, displays the lowest regional diversity.

The regional level of diversity plays an important role in the European Union (EU). The
success of European integration is often called into question because of the high level of
cultural diversity. This was debated in particular before the last enlargement, when the EU
grew from 15 to 25 and shortly after to 27 member states. It will eventually lead to even more
controversial debates regarding future enlargement plans. With the above approach, the
developments in the level of diversity through language, ethno-racial, and religious
characteristics can easily be traced.

Figure 3.3 shows the diversity level of the EU for each wave of enlargement.195 The
predecessor of today's EU was initiated in 1952, including Belgium, France, Germany,
Italy, Luxembourg and the Netherlands. This ‘core Europe’, as it is often referred
to, displayed a regional DELF value of 0.37. The next two enlargement waves added
nearly 25% to the total population. However, these countries were not overly different
from the existing group and were internally rather homogeneous. Hence, the DELF only
slightly increased. The addition of Portugal and Spain in 1986, two populous and very
homogeneous countries, slightly decreased the overall level of European diversity, whereas


193 For the weighting, population data averages for 2005-2010 from the World Development Indicators
(World Bank, 2011) were used. For more details on how regional averages are calculated and the differences
in the calculation of DELF values between countries, see Appendix C.2.7.

194 Note that from the single country perspective, SSA still has the countries with the highest internal
heterogeneity. This is an indication that the drawing up of borders in Asia proceeded more ‘endogenously’
than the method used in SSA by the colonial powers.

195 For more details on the different waves of enlargement in the EU, and the respective diversity levels,
see Table C.9 of Appendix C.3.

Figure 3.3: Average DELF values of the EU per enlargement wave


the large enlargement by 10 countries in 2004, and by two more in 2007, again increased
the DELF level significantly.196 Looking at potential future enlargements, the admission
of mainly Balkan states, as well as Iceland (EU--B), would not change the status quo
greatly. The highest increase in diversity within the EU would result from admitting
Turkey (EU--T). The increased cultural diversity Turkey would bring to the EU cannot
be judged as good or bad per se; however, it offers an easy target for exploitation of
these differences and for political agitation. This was already the case during earlier waves
of enlargement, which displayed only marginal increases in the EU's diversity. As the
increase Turkey would bring would be far greater, the potential for exploitation and
political agitation could be far greater as well.


Finally, the DELF values between countries are compared with the most widely used
measure of cultural distance between countries, genetic distance. Matching them
with the detailed data on genetic diversity compiled by Spolaore and Wacziarg (2009)
yields only a very limited correlation (Table 3.11).197 The rank correlation of genetic
distance and the composite DELF is only 0.25, and thus fails to meet both of the redundancy
thresholds discussed above.198 This comparison underlines that the genetic distance data
is hardly a good proxy for the ‘cultural’ differences between countries.
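A rank correlation of this kind can be sketched as follows. The text does not state which rank correlation coefficient is used; the sketch assumes Spearman's coefficient, computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The input values are invented and do not reproduce any figure from Table 3.11.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # average of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation (assumed here as the rank correlation)."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical dyadic values: a DELF-type measure and a genetic-distance-type measure.
delf_values = [0.10, 0.40, 0.30, 0.90]
genetic_values = [10.0, 30.0, 20.0, 40.0]
rho = spearman(delf_values, genetic_values)
```

Because only the ordering of the values matters, any monotone transformation of either series leaves the coefficient unchanged, which is why it suits the comparison of two measures on different scales.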


196 One important caveat applies here. As essay 2 outlined, cultural heterogeneity levels are subject
to change. As the underlying data for the DELF calculation dates from the years 2005-2010, using it
for time frames of over 50 years ago will lead to distorted values. Thus, the DELF values for the EU
enlargement for the earlier years can only be taken as an indication. The changing DELF values are only
attributable to compositional changes of the European Union and not to expected changes over time.

197 Spolaore and Wacziarg (2009) construct two measures of genetic relatedness between countries. One
is based only on the genetic distances between the plurality ethnic groups of each country. The second is
a measure of weighted genetic distance of all groups. The latter construction is more comparable to the
one employed in this essay.

198 As expected from the characteristic definition, the highest correlation of the genetic data is with the
ethno-racial DELF values at 0.7. Both are correlated but still seem to measure different things.

              DELF      DELF_L    DELF_E    DELF_R

DELF          1
              (43890)
DELF_L        0.566     1
              (43890)   (43890)
DELF_E        0.489     0.636     1
              (43890)   (43890)   (43890)
DELF_R        0.899     0.363     0.193     1
              (43890)   (43890)   (43890)   (43890)
Genetic       0.245     0.484     0.697     0.018
Distance      (30800)   (30800)   (30800)   (30800)


Table 3.11: Rank correlations between DELF, its sub-indices and genetic distance data
(observations in parentheses)


3.7 Conclusion 


Taking the mutual (dis)similarities between ethnic groups into account, the new DELF 
index covers a new and very important aspect of ethnicity: its diversity. This additional 
aspect was ignored by the most commonly used measures of ethnicity. The DELF index 
for 210 countries shows considerable differences between countries and regions. The dif- 
ferences suggest that it indeed measures different aspects of ethnicity, which might have 
a contrasting effect on the socio-economic problems under investigation. 

Many current papers analyzing the role of ethnicity based on the ELF index can
profit from taking the mutual (dis)similarities between individual groups into account. In
countries where ethnic groups show greater differences, it might be even more difficult to
agree on public goods (e.g., infrastructure or social security systems), as has already been
shown by Alesina et al. (1999). Caselli and Coleman (2008) discuss the importance of
barriers between groups, which prevent assimilation between them, for the incidence of wars.
This is exactly what Collier and Hoeffler (1998, 2004), Collier et al. (2009) and Fearon and
Laitin (2003) try to find in their analyses, i.e., whether ethnic fragmentation increases the
incidence of wars. They do not find a robust influence of ELF on conflict incidence.
It might still be the case that there is a strong influence of ethnic diversity on conflicts, but
that the applied ELF index does not measure the appropriate aspect of ethnicity needed to
show this. Additionally, the possibility of analyzing the single characteristic DELF for very
specific questions offers new room for application. Akdede (2010), for example, shows the
different implications of ethnic and religious fractionalization on democratic institutions.

Research that leveraged genetic distances to assess the dissimilarity between countries
should equally profit from employing the DELF between countries. It offers a much
more comprehensive data set of ‘cultural’ affinity between nations. As de Groot (2009)
concludes, geographical distance, often used in spatial economics, is not necessarily
the appropriate measure to assess the influences one country might have on others. Nor does
genetic distance really offer a satisfying alternative. The DELF values between countries
offer an excellent and valid extension of the analysis of spillover effects between countries.
De Groot (2009) shows the role cultural affinity between neighboring countries plays in
the spillover of conflicts.

Trust is associated closely with more homogeneous and similar country setups. Genetic 
distance only covers trust in a very limited way. Trust is seldom hidden in the genetic code, 
evolving out of the interaction between individuals whose cultural backgrounds play an 
important role. Leveraging genetic distance is even more problematic in Spolaore and 
Wacziarg's (2009) analysis on the spillover effect of innovations and development between 
countries. Imitation and adaptation costs of innovations rely significantly more on the 
‘cultural’ barriers (different language, ethno-racial background and beliefs) than on the 
biological ones (genes). 


Nevertheless, there are some caveats that one cannot overlook. As the data source
used is somewhat unique in its combination of all characteristics, only limited robustness
checks with other sources on the combination of the characteristics are possible. Secondly,
the weighting of the three sub-indices is debatable, as is the case for most composite index 
calculations. Here, the most general approach is used. For specific questions, different 
emphasis might be given to specific characteristics. The clear discussion and overview of 
the single sub-indices should encourage every researcher to do so. Finally, there might 
be country or region-specific characteristics influencing cultural diversity not covered in 
the (globally comparable) three characteristics treated in this essay. The caste system in 
India would be one example. Thus, for a country or region-specific analysis, the diversity 
data offered might have restricted relevance. Nevertheless, the approach discussed here 
can still be applied. 

In the above cases the DELF index should be more appropriate than the ELF index 
as it incorporates the fundamental concept of diversity. The extension to measure cultural 
dissimilarities between nations offers a good alternative to the applied genetic distance 
data. The broad foundation and the detailed new data set should be a call to critically
review the usage of the ELF index and the genetic distance data. Additionally, it provides
a starting point for new research on the specific role of the diversity of countries.


199 For an indication of how a common language increases trust and common identification in a case study
for the US, see Chong et al. (2010). Falck et al. (2010) show that German cross-regional migration and
economic exchange can be attributed to dialect similarities from the 19th century that remain today.

Chapter 4


The Implications of Ethnic Diversity


4.1 Introduction 


Recent literature on the role of ethnicity in socioeconomic contexts has strived to find
different ways of measuring its various aspects. Most papers rely on an index covering
the fragmentation (ELF) or the polarization (POL) of a country’s ethnic groups.200 In
general, the ELF is an increasing function of the number of groups. A country with more,
and thus relatively smaller, groups is more fragmented. In contrast, the POL measures
the deviation from a situation of two equally sized groups. For such a setup, the POL
reaches its maximum value; it decreases as further groups are added. Already from their
construction one can see that both measures cover different ethnic setups and are therefore
supposed to explain different problems. The distance adjusted ethno-linguistic
fractionalization index DELF now adds a third aspect, a country’s ethnic diversity.

For many economic problems, it is not the pure quantity of (relative) groups which is 
of interest, but the difficulty of coordination or instrumentalization between them. This 
is crucially dependent on the differences between these groups and not only on their mere 
existence. Thus, one expects the DELF to exhibit a different performance on a range of 
economic questions compared to the ELF and POL. This chapter shows the applicability 
of the DELF index in different fields, selected so as to cover a broad range of economic
problems.201

200 See, for example, Mauro (1995), Easterly and Levine (1997), Collier (2001), Alesina et al. (2003) and
Alesina and La Ferrara (2005) for different applications of the ELF concept and Duclos et al. (2004),
Garcia-Montalvo and Reynal-Querol (2002, 2005b, 2010) and Ranis (2011) for the application of the POL
index.

201 Additionally, limited access to the underlying data of some relevant articles in the respective fields
hindered a broader testing of the applicability of the index, therefore affecting the selection of the replications.
Again, a sincere thank you to all authors who generously shared their data for this study.

The mere quantity of groups might demonstrate more divisions through which conflicts
may ignite, arguing for the ELF index. By contrast, in very heterogeneous countries
with many small groups, coalition building to create a strong enough power base becomes
more difficult. This is why others argue that it is the specific quantitative constellation 
of groups, i.e., their polarization (POL), that impacts the incidence of conflicts (Garcia- 
Montalvo and Reynal-Querol, 2005b). Caselli and Coleman (2008) point to obvious barri- 
ers between groups that are important regarding the decision to enter conflicts. If a group 
can easily exclude another (potentially the defeated group) from the resources gained (due 
to an obvious ethnic identifier existing between both groups), it raises the incentives to 
start a conflict. This would argue for an important role of the diversity measure DELF. 
Thus, it is difficult to decide a priori which ethnicity index better explains conflicts. 

A high ethnic fragmentation is associated with lower growth rates, mainly through its
effect on other socioeconomic variables (e.g., corruption or public goods provision).202 The
more groups that exist, the more the visions regarding the realization of education, the
location and forms of infrastructure, or the design and extent of institutions differ. Because
every group wants at least some of its wishes to be met, the government's difficulty in
achieving a consensus and distributing its available funds might indeed depend on the mere
quantity of groups, reflected in a higher fragmentation (ELF). In contrast, the different
backgrounds and experiences of a country's working population may be an asset to sustain
more complementary production procedures and drive innovation. For this, not only the
mere quantity of groups, but also their differences seem to be relevant. This potential
might, however, only unfold in more developed countries.203 Whether and how the ELF
or the DELF may impact economic growth is, again, not completely clear.

Trust is an important precondition for nearly any transaction. Different groups should
equally influence the general trust level in a country. Bjørnskov (2008), however, finds no
significant impact of fractionalization (ELF) on trust. This might be because
the quantity of groups in a country is less relevant to the emergence of trust than
the differences between them (which only the DELF takes into account). Unfortunately,
this cannot be confirmed by the following analyses.

The DELF has an additional major advantage in that it can be used to assess the
cultural differences between countries. For both the ELF and POL, this is not possible.
Thus far, bilateral differences between cultures were assessed by data based on quite
limited differences (e.g., genetic distance) or a broad set of proxy variables that are often
regionally bounded (e.g., mutual voting behavior at regional song contests). The global
DELF data should offer some escape from these limitations. Again, taking the level of trust
as an important prerequisite for any economic activity, cultural diversity affects the level
of positive opinions between countries (Disdier and Mayer, 2007). More specifically, the
DELF shows a significant impact on trade volumes in two analyses with a European
focus. It substitutes a list of various cultural affinity proxies very well, thus paving the
way to expand these trade analyses to a global scale.

202 The most prominent are Easterly and Levine (1997), Alesina et al. (2003) and Alesina and La Ferrara
(2005).

203 One can, however, also argue that with a rising difference between the groups a consensus becomes
even more difficult and thus a high diversity should also have a negative impact on growth. For these
potentially different effects, the development level of a country seems to be especially crucial and will be
further discussed in section 4.4.

The remainder of this chapter is organized as follows. Section 4.2 gives a short overview 
of the main indices commonly used and outlines the clear distinction between them and 
the DELF index. In section 4.3, the DELF is tested based on its implications for conflict 
incidence. The differing impact of ethnic diversity on economic growth is analyzed in 
section 4.4. Section 4.5 discusses the potential improvements of the diversity measure 
against the fractionalization measure for the level of trust within and between countries. 
Section 4.6 again uses DELF values between countries to identify their role in bilateral
trade. Finally, section 4.7 summarizes the key findings, concludes, and gives an outlook
for further research.


4.2 Overview of relevant indices 


The most commonly used index to cover ethnicity in the economic context is the
ethno-linguistic fractionalization index (ELF). It was first published for a broad range of
countries by Taylor and Hudson (1972).204 The ELF is calculated as a Herfindahl-Hirschman
concentration index:

\[
ELF = 1 - \sum_{i=1}^{K} p_i^2, \quad \text{for all } i \in \{1, \ldots, K\} \tag{4.1}
\]

where K is the number of groups i, and p_i represents their relative group sizes. Its value
moves between zero and one and represents the probability that two randomly selected
individuals from a population come from different groups. A higher value indicates a more
fragmented country, i.e., a country with a higher number of distinct ethnic groups.

The second prominent measure is an index of polarization, introduced by Garcia-Montalvo
and Reynal-Querol (2002).205 Assessing the deviation from an even 50/50
split between two groups, Garcia-Montalvo and Reynal-Querol (2002) find that this index is
a much better predictor of conflict incidence than the ELF measure. The polarization
index (POL) is defined as:

\[
POL = 1 - \sum_{i=1}^{K} \left( \frac{0.5 - p_i}{0.5} \right)^2 p_i, \quad \text{for all } i \in \{1, \ldots, K\} \tag{4.2}
\]

The POL index also tends towards zero for very homogeneous countries, i.e., with only
one group. However, with increasing group numbers, ELF and POL follow clearly different
courses. While ELF is an increasing function of the number of groups, POL reaches its
maximum with two equally sized groups and decreases afterwards.

204 Ginsburgh and Weber (2011, Ch. 6) offer a good overview of the different classes of indices used, their
historical development and recent applications. For a broad overview on general concepts and measures of
ethnicity, see Brown and Langer (2010).

205 Their work is considerably based on Esteban and Ray (1994).
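The different courses of the two indices can be checked numerically. The following minimal sketch implements the fractionalization index as one minus the sum of squared group shares and the polarization index as one minus the share-weighted squared deviation from 0.5, scaled by 0.5. The function names and the equally sized example groups are chosen purely for illustration.

```python
def elf(shares):
    """Fractionalization: probability that two random individuals differ in group."""
    return 1.0 - sum(p * p for p in shares)


def pol(shares):
    """Polarization: one minus the weighted squared deviation from a 50/50 split."""
    return 1.0 - sum(((0.5 - p) / 0.5) ** 2 * p for p in shares)


# Countries with 1, 2, 3, 4 and 10 equally sized groups (illustrative only).
comparison = []
for k in (1, 2, 3, 4, 10):
    equal_shares = [1.0 / k] * k
    comparison.append((k, elf(equal_shares), pol(equal_shares)))
```

For equally sized groups the fractionalization value equals 1 - 1/K and keeps rising with K, whereas the polarization value peaks at one for K = 2 and falls for every additional group.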

The DELF now takes the distance between these groups into account. The idea dates
back to Greenberg (1956), and later Fearon (2003), who both proposed using linguistic
similarities between language groups to cover the distance aspect. In addition to the
language characteristic, the DELF includes information on the ethno-racial and religious
characteristics of all groups.206 The three characteristics are weighted to arrive at the
composite DELF values.207 Aligned with the ELF index, the DELF index is calculated
as:

\[
DELF = 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k \, p_l \, \hat{s}_{kl}, \quad \text{for all } k, l \in \{1, \ldots, K\} \tag{4.3}
\]

where the combined \hat{s}_{kl} is the equally weighted average of the similarity values of each
ethnicity characteristic between all groups k, l ∈ {1,...,K}. This new global data set is
based on data from the World Christian Encyclopedia (Barrett et al., 2001) and offers
ethnic diversity data for 210 countries. By construction, a close relationship to the ELF
measure is evident. Both are influenced by the number of groups, which in a way determines
the relative group sizes, a key building block for both. Because the DELF additionally
accounts for the differences between groups, however, there are significant differences
between a country’s ELF and its DELF values. Figure 4.1 shows the ranks of all countries
according to their ELF and DELF values, where the highest values correspond to the
rank of one.208 Changes in the heterogeneity ranking of more than 30 places (indicated
by the dotted lines) are quite common. Countries such as Zambia, the Republic of Congo,
and Zimbabwe seem to be more homogeneous when using the DELF compared to their
ELF values. Conversely, Kazakhstan, Bahrain or the Sudan turn out to be more diverse
than fragmented.
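The relationship between the two indices can be made explicit in a small sketch. It assumes similarity matrices with values between zero and one and a diagonal of one; the shares and similarity values are invented, and the helper names are hypothetical. If groups share nothing with each other, the distance adjusted index collapses to the fractionalization index; any similarity between distinct groups lowers it.

```python
def combined_similarity(language, ethno_racial, religion):
    """Equally weighted average of the three characteristic similarity matrices."""
    return [
        [(a + b + c) / 3 for a, b, c in zip(row_l, row_e, row_r)]
        for row_l, row_e, row_r in zip(language, ethno_racial, religion)
    ]


def delf(shares, similarity):
    """One minus the expected similarity of two randomly drawn individuals."""
    size = len(shares)
    return 1.0 - sum(
        shares[a] * shares[b] * similarity[a][b]
        for a in range(size)
        for b in range(size)
    )


shares = [0.5, 0.3, 0.2]  # hypothetical relative group sizes

# Groups that share no characteristic: similarity is the identity matrix.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
delf_unrelated = delf(shares, identity)

# Groups 1 and 2 share language and religion but not the ethno-racial trait.
language = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
religion = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
delf_related = delf(shares, combined_similarity(language, identity, religion))
```

In this toy country the first value equals the fractionalization index of 0.62, while the second is lower because the two largest groups are similar in two of the three characteristics.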

In addition to the diversity data of single countries, the index offers information on
nearly 44,000 dyadic relations between countries and their respective cultural distance.
This is a key advantage of the DELF as the ELF and POL do not allow for any assessment
of ethnic differences between countries. For this reason, an index of genetic distance was
often used, although one may raise some reasonable objections to its applicability.209 It is
based on a rather limited number of 42 distinct world populations for its calculation and
turns out to be a predictor for geographic distance from a common place within Africa
(Giuliano et al., 2006; Ramachandran et al., 2005). The global DELF values between
countries offer a good option to escape from these limitations.

206 Two other recent approaches consider a set of characteristics to assess the differences between groups.
Bossert et al. (2011) combine ethnic and various socioeconomic differences between citizens to construct
ELF values that incorporate diversity for US counties. De Groot (2009) measures a broad set of cultural
characteristics (e.g., language, religion) to assess the ethno-linguistic affinity between countries in Africa.

207 For more details on the distance calculations for each characteristic and the different weighting
possibilities, see Appendix C.2.

208 More details on the ELF and DELF values are found in Table C.7 of Appendix C.3.

209 The assessment of genetic distance can be traced back to the pioneering work of Cavalli-Sforza and
Feldmann (1981), who created phylogenetic trees by mapping the differences in special sections of the
human DNA.

Figure 4.1: Scatter plot of ELF and DELF rank values


4.3 Implications of ethnic diversity on conflict 


The relationship between ethnic division and conflict is probably the most researched field
regarding a possible impact of ethnicity. As a result, different aspects of conflict have been
highlighted. The analyses of its roots and influencing factors differ for the incidence, onset
and duration of conflicts. While incidence measures whether any form of conflict is currently
occurring in a given country, onset measures whether a specific conflict starts at a
given time. For example, a country that experiences a two-year conflict exhibits conflict
incidence for both years, but conflict onset only in the first year. Finally, duration
measures the overall length of a given conflict. In the aforementioned example, this would
be two years.210
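The three conflict measures can be derived from a simple yearly series, as the following sketch shows. It assumes one observation per consecutive year and treats every uninterrupted run of conflict years as one conflict; a conflict that is already ongoing in the first observed year is counted as starting in that year. These simplifications are assumptions of the sketch and not necessarily the coding rules of the data sets used in the literature.

```python
def conflict_measures(active):
    """Derive incidence, onset and durations from yearly conflict indicators.

    active holds one boolean per consecutive year (True = conflict ongoing).
    """
    incidence = [int(flag) for flag in active]
    onset = [
        int(flag and (year == 0 or not active[year - 1]))
        for year, flag in enumerate(active)
    ]
    durations = []
    run = 0
    for flag in active:
        if flag:
            run += 1
        elif run:
            durations.append(run)
            run = 0
    if run:  # conflict still ongoing in the last observed year
        durations.append(run)
    return incidence, onset, durations


# The example from the text: a conflict lasting two years.
incidence, onset, durations = conflict_measures([False, True, True, False])
```

For the two-year example, incidence is one in both conflict years, onset is one only in the first of them, and the single recorded duration is two years.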

For conflicts in general, greed, rather than grievance, is held responsible (Collier and
Hoeffler, 2004). More opportunities to hide in mountainous regions, the possibility of
gaining higher amounts of natural resources, and lower opportunity costs for an impoverished
population are brought forward as arguments. However, the more ethnic groups are
oppressed by the ruling regime, the higher the probability of revolts. Not only oppression, but
also marginalization and the intentional underdevelopment of groups not belonging to the
ruling clan may raise tensions, which may develop into conflicts. In line with the greed and
opportunity theories, a broad strand of literature relying on the ELF index has not found
strong empirical evidence for a relationship between ethnic fragmentation and any of the
conflict measures (Fearon and Laitin, 2003).211 Apparently the mere number of groups is
not that relevant for conflict.212

210 For more details on the different conflict measures, see Bleaney and Dimico (2009).

These arguments led Garcia-Montalvo and Reynal-Querol (2002) to develop a polarization
index (POL) as a more relevant measure of the relationship between ethnic division
and conflict.213 They argue that deviation from the situation of two equally strong groups,
that might both seize power over the whole country, is more relevant for the incidence of 
conflicts than the fractionalization of a country. In general, polarization is indeed more 
robustly associated with the conflict measures. 

In a theoretical contribution, Caselli and Coleman (2008) stress the importance of 
potential excludability of the defeated party from economic or political gains. The
possibility of excluding another group based on obvious barriers (physiognomic, language,
ethnic) between them raises the incentives to start a conflict. The distance between groups,
mirrored in the DELF index, should be a relevant factor in deciding whether or
not to start a war. 
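
The role of group distances can be sketched with a generic distance-weighted diversity measure. This is an illustration only: the exact construction of the DELF is defined elsewhere in the book and may differ, and the function name, the distance matrices and the shares below are hypothetical.

```python
def distance_weighted_diversity(shares, distance):
    """Expected distance between two randomly drawn individuals.

    shares:   population shares of the groups (summing to one)
    distance: matrix with distance[i][j] in [0, 1] and zeros on the diagonal
    """
    n = len(shares)
    return sum(
        shares[i] * shares[j] * distance[i][j]
        for i in range(n)
        for j in range(n)
    )


shares = [0.5, 0.5]
close_groups = [[0.0, 0.2], [0.2, 0.0]]    # two very similar groups (assumed distance)
distant_groups = [[0.0, 1.0], [1.0, 0.0]]  # two completely different groups

print(round(distance_weighted_diversity(shares, close_groups), 2))    # 0.1
print(round(distance_weighted_diversity(shares, distant_groups), 2))  # 0.5
```

When every pair of distinct groups has distance one, the measure collapses to the fractionalization value 1 - sum(p_i^2), so the two countries above would be indistinguishable under a pure fractionalization index.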

Garcia-Montalvo and Reynal-Querol (2005b) test the applicability of ethnic and religious
polarization against the respective fractionalization indices for the incidence of wars.
They use data from the Peace Research Institute of Oslo (PRIO), which include intermediate
and high-intensity armed conflicts. A range of standard control variables (GDP/capita,
Population, Primary exports, Mountains, Non contiguous, and Democracy) is included in
all regressions. The regressions in Table 4.1 are replications of those in the original
article and use a logit model for the incidence of civil wars based on five-year periods.
The ethnic polarization variable (Ethnic pol.) clearly outperforms the fractionalization
variable (Ethnic frac.) with regard to the level of significance.²¹⁴ All control variables carry
the expected sign.
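
As a reading aid for the regression tables, the sketch below maps a reported t statistic to the significance stars used in the table notes. It assumes a two-sided test with a standard normal approximation; the software behind the original estimates may use a different reference distribution, so borderline cases can differ. The function names are hypothetical.

```python
import math


def two_sided_p(t_stat):
    """Two-sided p-value of a t statistic under a standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


def stars(t_stat):
    """Significance stars following the table notes: * 10%, ** 5%, *** 1%."""
    p = two_sided_p(t_stat)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# t statistics of the ethnic variables reported in Table 4.1
print(stars(2.97))  # ***
print(stars(2.23))  # **
print(stars(1.89))  # *
print(stars(0.19))  # (empty string: not significant)
```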

The regressions in Table 4.2 now rebuild the approach of Garcia-Montalvo and Reynal-Querol
(2005b). However, the fractionalization indices are replaced by the composite DELF and the
religion-based DELF_R.²¹⁵ The higher significance of the polarization measure (Ethnic pol.)
fades and gives way to the composite DELF. The coefficients for the control variables and
their significance levels remain more or less unchanged. It is apparent that the DELF,
covering the differences between groups, contains important information regarding


211 See also Collier and Hoeffler (2004) and Collier et al. (2009).

212 Furthermore, the quantity of groups, demonstrating more divisions through which conflicts may arise,
may make coalition building in order to create a strong enough power base more difficult. This may
additionally impede a linear relationship of ELF and conflicts.

213 This is based on earlier work of Esteban and Ray (1994). Garcia-Montalvo and Reynal-Querol (2005b,
2010) further develop the polarization index and its application. See Blattman and Miguel (2010) for a
broad literature overview.

214 The ethnic variables are also based on data from the World Christian Encyclopedia (Barrett et al.,
2001), whereas the religious measures are mainly built based on data from L’État des Religions dans
le Monde (Clévenot, 1987).

215 To be consistent, the fractionalization indices were also taken from the same data source from which
the DELF is calculated, i.e., the WCE.
                     (1)        (2)        (3)        (4)        (5)
                     Conf.      Conf.      Conf.      Conf.      Conf.
Ethnic frac.         1.19*                 0.18                  0.05
                     (1.89)                (0.19)                (0.05)
Ethnic pol.                     2.38***    2.29**                2.09**
                                (2.97)     (2.23)                (2.03)
Rel. frac.                                            -4.97*     -4.45
                                                      (-1.65)    (-1.39)
Rel. pol.                                             3.90**     3.29
                                                      (1.97)     (1.59)
Ln (GDP/capita)      -0.22      -0.44**    -0.42*     -0.33      -0.38
                     (-1.27)    (-1.99)    (-1.79)    (-1.43)    (-1.33)
Ln (Population)      0.35**     0.41**     0.40**     0.44***    0.44***
                     (2.18)     (2.40)     (2.21)     (3.01)     (2.72)
Primary exp.         -0.91      -1.01      -1.07      -0.35      -0.90
                     (-0.52)    (-0.54)    (-0.57)    (-0.22)    (-0.48)
Mountains            0.00       -0.00      -0.00      0.00       -0.00
                     (0.49)     (-0.25)    (-0.19)    (0.29)     (-0.16)
Non contiguous       0.08       0.30       0.29       0.31       0.48
                     (0.13)     (0.49)     (0.48)     (0.49)     (0.79)
Democracy            0.08       0.03       0.03       0.02       -0.03
                     (0.21)     (0.09)     (0.09)     (0.05)     (-0.09)
Constant             -5.82**    -6.23*     -6.80**    -6.90**    -7.47**
                     (-2.06)    (-1.93)    (-2.01)    (-2.30)    (-2.32)
Observations         846        846        846        846        846
Pseudo R²            0.101      0.122      0.122      0.110      0.134

Cluster robust t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 4.1: Original logit regression for the incidence of civil wars as found in Garcia-Montalvo
and Reynal-Querol (2005b)


the incidence of conflicts. In line with the contribution of Caselli and Coleman (2008), 
obvious barriers should play a role in this decision. That the composite DELF, covering 
all characteristics, has a significant impact compared to the DELF solely based on religion, 
confirms their theoretical arguments. In most cases, religious identification may not be an 
obvious enough characteristic to rule out future assimilation. 

Having found a significant impact of DELF on the incidence of conflicts does not,
however, allow one to infer its applicability to other conflict measures, namely onset
and duration. Further research is deemed necessary to fully understand how the different
aspects of a country's ethnic composition affect the different phases of conflict.


4.4 Implications of ethnic diversity on growth 


The second most prominent question regarding ethnicity's role is whether, and how, it affects
economic growth. This was the starting point for the seminal paper of Easterly and
                     (1)        (2)        (3)        (4)        (5)
                     Conf.      Conf.      Conf.      Conf.      Conf.
DELF                 2.40*                 2.55**                4.25**
                     (1.87)                (2.05)                (2.31)
Ethnic pol. (WCE)               0.45       0.74                  0.28
                                (0.39)     (0.64)                (0.22)
DELF_R                                                -9.59      -12.48
                                                      (-1.38)    (-1.53)
Rel. pol. (WCE)                                       5.98       6.43
                                                      (1.35)     (1.30)
Ln (GDP/capita)      -0.35      -0.47*     -0.40      -0.42*     -0.47
                     (-1.48)    (-1.87)    (-1.50)    (-1.75)    (-1.64)
Ln (Population)      0.39***    0.40**     0.39**     0.42***    0.45***
                     (2.59)     (2.56)     (2.54)     (2.75)     (3.37)
Primary exp.         -0.96      -0.27      -1.00      0.07       -0.46
                     (-0.48)    (-0.16)    (-0.50)    (0.04)     (-0.26)
Mountains            0.00       0.00       0.00       0.00       0.00
                     (0.54)     (0.30)     (0.41)     (0.34)     (0.17)
Non contiguous       0.10       0.12       0.21       0.05       0.13
                     (0.16)     (0.18)     (0.31)     (0.08)     (0.18)
Democracy            0.03       0.08       0.02       0.06       -0.02
                     (0.09)     (0.22)     (0.05)     (0.16)     (-0.07)
Constant             -6.17**    -4.94*     -6.10**    -5.58*     -6.52**
                     (-2.15)    (-1.70)    (-2.09)    (-1.92)    (-2.35)
Observations         833        833        833        833        833
Pseudo R²            0.108      0.092      0.110      0.101      0.128

Cluster robust t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 4.2: Logit regression for the incidence of civil wars, based on Garcia-Montalvo and
Reynal-Querol (2005b)


Levine (1997), who concluded that Africa's lower growth rate can be explained to a large
extent by its higher ethnic fragmentation. Their approach was extended and updated
with new ELF data by Alesina et al. (2003). Subsequently, Schüler and Weisbrod (2010)
added an additional decade of observation and thus based their analysis on a broader
foundation.²¹⁶ They all very much confirm the negative effect of the ELF on a country's
growth rate. For more developed countries with better education and infrastructure, this
effect is found to be less detrimental (Alesina and La Ferrara, 2005). Ethnic diversity
might even be a driver of innovation for these countries and should thus affect growth in a
positive way. Nevertheless, cooperation is apparently more difficult in more heterogeneous
countries, so it is natural to question the DELF’s role in economic growth. As the data
compiled by Schüler and Weisbrod (2010) offer the most observations, it seems obvious
to replicate their analyses.


216 Whereas Alesina et al. (2003) covered the period from 1960 to 1989, Schüler and Weisbrod (2010) 
expand the data to cover the period 1960 to 1999. 
                         (1)          (2)          (3)          (4)
                         Growth       Growth       Growth       Growth
Africa                   -0.009**     -0.014***    -0.012***    -0.016***
                         (-2.66)      (-3.47)      (-3.67)      (-4.27)
La. America              -0.016***    -0.014***    -0.018***    -0.016***
                         (-5.93)      (-4.63)      (-6.60)      (-5.08)
Ln (GDP/cap.)            0.041***     0.027        0.045***     0.030
                         (2.71)       (1.46)       (2.89)       (1.59)
(Ln (GDP/cap.))²         -0.003***    -0.003**     -0.003***    -0.003***
                         (-2.99)      (-2.47)      (-3.08)      (-2.61)
Ln (Schooling)           0.011***     0.003        0.011***     0.003
                         (3.40)       (0.66)       (3.36)       (0.66)
Assassinations                        -21.103**                 -19.766**
                                      (-3.48)                   (-2.18)
Financial depth                       0.009**                   0.010**
                                      (2.14)                    (2.00)
Black market premium                  -0.021***                 -0.021***
                                      (-5.34)                   (-5.31)
Fiscal surplus/GDP                    -0.000**                  -0.000*
                                      (-1.81)                   (-1.91)
Ln (Telephones/worker)                0.016***                  0.017***
                                      (3.15)                    (3.34)
ELF (Alesina)            -0.019***    -0.012**
                         (-3.86)      (-2.20)
DELF                                               -0.017**     -0.005
                                                   (-2.34)      (-0.58)
Observations             83/88/       38/67/       81/87/       38/67/
                         94/92        74/80        93/91        74/79
R²                       0.24/0.24/   0.46/0.45/   0.21/0.20/   0.46/0.44/
                         0.36/0.16    0.49/0.30    0.35/0.12    0.48/0.27

Robust t statistics in parentheses; observation and R² values are decade specific
* p < 0.10, ** p < 0.05, *** p < 0.01
Growth is measured as annual growth rate of per capita GDP


Table 4.3: Influence of ethnic diversity on economic growth, based on Schüler and Weisbrod
(2010)

Table 4.3 shows, in regressions (1) and (2), replications of the original growth regressions
of Schüler and Weisbrod (2010).²¹⁷ Regression (1) contains only a limited set of control
variables that are supposed to influence the economic development of countries. Both
regional dummies, for Africa and Latin America, are negative and significant at the 5%
and 1% levels, respectively. The income level (GDP/cap.) at the beginning of each decade
shows a catch-up effect, at a slightly diminishing rate, as its squared term is negative but
with a very small coefficient. As one expects, Schooling has a significant effect on increasing


217 The regressions here are run, in line with Schüler and Weisbrod (2010), using seemingly unrelated 
regressions (SUR). SUR is used to allow for country random effects to be correlated across decades, in 
order to increase the efficiency of the estimators. However, comparing the results to a model run with 
robust OLS regressions and decade dummies displays nearly no differences. Thus, the decade correlations 
seem to be very limited. 
growth.²¹⁸ Finally, ethnic fractionalization (ELF), based on the data compiled by Alesina
et al. (2003), reveals a detrimental growth effect. A completely homogeneous country
would expect an annual growth rate almost 2 percentage points higher than that of a
completely fractionalized country. The different level of ethnic fractionalization between
Korea and Côte d'Ivoire is thus responsible for roughly 1.6 percentage points of their growth
rate differential.²¹⁹ Regression (2) now includes a broad set of variables affecting growth.
The number of Assassinations, the Black market premium and the Fiscal surplus all
negatively affect growth at highly significant levels of 1%. Financial depth and the number
of Telephones per worker are used as proxies for the level of infrastructure in a country,
with both showing a growth-enhancing potential and being highly significant. As ethnic
fractionalization per se can hardly impact upon growth, all of these variables are meant to
be channels through which ethnic fragmentation affects growth. This is supported by a high
correlation between the ELF and those variables. Indeed, Easterly and Levine (1997) and
Alesina et al. (2003) find a vanishing effect of the ELF as the number of covariates included
in the regressions increases, until it becomes insignificant. By including data from the 1990s,
Schüler and Weisbrod (2010) find a robust, albeit smaller, negative effect of the ELF on
growth, controlling for all other variables. This, therefore, still confirms that the ELF
potentially works through affecting these variables.²²⁰


Regression (3) now replaces the ELF values with DELF values. Nearly all of the coefficients
and significance levels remain relatively unchanged and, interestingly, the DELF
displays nearly the same coefficient as the ELF. However, it loses its significant impact
when all controls are included in regression (4), as in the articles of Easterly and Levine
(1997) and Alesina et al. (2003). Although the coefficients look similar, their economic
impact differs. Whereas an increase of one standard deviation in the ELF reduces growth
by 0.56 percentage points, the same increase in the DELF would only lead to a reduction
in growth of 0.29 percentage points.²²¹ Again comparing Korea and Côte d'Ivoire, the
difference in their respective DELF levels is responsible for slightly less than one percentage
point of their growth rate differential.²²² Thus, ethnic diversity seems to be less detrimental
to economic growth than ethnic fractionalization. As both affect growth through different
variables (channels), a more detailed analysis of the ELF and DELF effects is deemed
necessary here.


218 Measured as the average years of total school attainment at the start of the decade.
219 In the data of Alesina et al. (2003) Korea has an ELF of 0.002, whereas Côte d'Ivoire has an ELF of
0.82.
220 If one uses the ELF index based on the same data as the DELF (WCE data), its effect remains highly
significant in regression (1) but fades again in regression (2).
221 The standard deviation of ELF is 0.27, whereas it is only 0.16 for the DELF. For the annual growth
rate the standard deviation is 0.027.
222 Korea has a DELF of 0.032, whereas Côte d'Ivoire has a DELF of 0.586.
However, for these basic regressions, the sheer number of groups is a more robust predictor
than a measure that additionally takes their differences into account.²²³
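
The Korea and Côte d'Ivoire comparison can be reproduced from the reported figures. The sketch below uses the rounded coefficients from regressions (1) and (3) of Table 4.3 and the index values given in the footnotes; the function name `growth_gap` is hypothetical, and results based on unrounded estimates may differ slightly.

```python
def growth_gap(coefficient, index_a, index_b):
    """Predicted difference in annual growth (as a fraction) between countries a and b."""
    return coefficient * (index_a - index_b)


# Côte d'Ivoire versus Korea, using the index values reported in the footnotes
elf_gap = growth_gap(-0.019, 0.82, 0.002)    # ELF coefficient, regression (1)
delf_gap = growth_gap(-0.017, 0.586, 0.032)  # DELF coefficient, regression (3)

print(round(100 * elf_gap, 2))   # -1.55 percentage points
print(round(100 * delf_gap, 2))  # -0.94 percentage points
```

The two values correspond to the "roughly 1.6" and "slightly less than one" percentage points of the growth rate differential mentioned in the text.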

Despite the finding that heterogeneity negatively affects economic growth, one can 
question if this relation changes in different country settings. The reasoning for this is 
promoted mainly by articles analyzing metropolitan regions and companies.²²⁴ They often
find that ethnic heterogeneity has a positive effect on innovation and productivity. With
data very similar to those above, Alesina and La Ferrara (2005) try to establish this
positive effect in a large-scale cross-country analysis. They show that the detrimental
effect of ethnic heterogeneity does indeed fade for more economically developed countries. 
The original paper, however, relies on the limited dataset from 1960-1989. 

Table 4.4 replicates the analysis of Alesina and La Ferrara (2005) with the extended
data of Schüler and Weisbrod (2010).²²⁵ Following the argumentation of Alesina and
La Ferrara (2005) that richer countries are less prone to the ELF's detrimental effect, the
heterogeneity measures (ELF and DELF) are both interacted with the countries’ income
levels (GDP/cap.). The negative effects for ELF and DELF remain in regressions (1) and
(2), although they are no longer or only marginally significant at conventional levels. The
same is true for the interaction terms of the heterogeneity measures (ELF/DELF) and
the level of initial income (GDP/cap.). Thus, the finding of Alesina and La Ferrara (2005)
cannot be confirmed for the extended time period. However, it is questionable whether a
mere higher income level is the basis for diversity to deliver benefits to a country. Countries
need instead to establish a common base that allows the different groups to interact in
a productive way.²²⁶ An indicator reflecting a broader perspective of development is the
Human Development Indicator (HDI). Regressions (3) and (4) include the HDI level and an
interaction term with the heterogeneity indices, replacing the income level used before.²²⁷
The ELF and DELF again enter the regression with the familiar significant negative
effect, although the DELF is slightly less significant. More interestingly, the interaction

223 The static nature of both the ELF and the DELF calls for an important caveat. Chapter 2 showed 
that the level of ethnic heterogeneity in a country is changing and makes inter alia education responsible 
for this. Although the ethnic setup of a country does not change quickly, an analysis covering four decades, 
with a single static ethnic measure, requires some caution. 

224 Ottaviano and Peri (2005) show that native US citizens receive higher wages in metropolitan areas
where ethnic heterogeneity is increasing. Again for the US, Sparber (2010) confirms a productivity in- 
creasing effect of racial diversity for cities, but less so at the state level. Florida (2004) argues that a more 
diverse agglomeration of creative capital increases innovations and ultimately economic growth. Equally, 
Niebuhr (2010) finds that cultural diversity raises innovative activity, positively affecting the performance 
of regional research and development (R&D) sectors in Germany. Ozgen et al. (2011a) show a comparable 
result for European regions. Watson et al. (1993) show empirically that more diverse teams need longer 
to establish a common understanding, but if that is reached they outperform more homogeneous groups. 
Similarly, Prat (2002) shows, in a game theoretical analysis, that the positive impact of a heterogeneous 
versus a homogeneous team depends on the complementarity of their tasks. A comparable result is also 
found in Hong and Page (1998). 

225 The regressions are again run with robust OLS and decade dummies that are not explicitly reported.

226 This is comparable to multicultural companies that need to enforce a common understanding between
their diverse employees to profit from their different backgrounds.


227 Besides purely economic measures, the HDI includes differences in its educational and health levels. 
The data is taken from UNDP - United Nations Development Programme (1994). 
                            (1)          (2)          (3)          (4)          (5)
                            Growth       Growth       Growth       Growth       Growth
Africa                      -0.012***    -0.016***    -0.010**     -0.015***    -0.012***
                            (-2.66)      (-4.02)      (-2.32)      (-3.83)      (-2.66)
La. America                 -0.017***    -0.018***    -0.017***    -0.018***    -0.016***
                            (-5.25)      (-5.78)      (-5.32)      (-5.79)      (-4.80)
Ln (GDP/cap.)               0.017        0.026        0.023        0.033*       0.031
                            (0.79)       (1.37)       (1.19)       (1.73)       (1.58)
(Ln (GDP/cap.))²            -0.002**     -0.003***    -0.003**     -0.003***    -0.003***
                            (-1.99)      (-2.73)      (-2.27)      (-2.89)      (-2.68)
Ln (Schooling)              0.002        0.002        -0.000       -0.002       -0.003
                            (0.32)       (0.35)       (-0.03)      (-0.43)      (-0.55)
Assassinations              -18.945**    -17.614*     -19.129**    -17.434*     -20.922**
                            (-1.99)      (-1.83)      (-2.02)      (-1.82)      (-2.19)
Financial depth             0.008*       0.007*       0.008        0.007        0.008*
                            (1.65)       (1.52)       (1.65)       (1.55)       (1.80)
Black market premium        -0.020***    -0.020***    -0.021***    -0.020***    -0.020***
                            (-5.02)      (-4.90)      (-5.08)      (-4.99)      (-4.98)
Fiscal surplus/GDP          -0.000*      -0.000*      -0.000*      -0.000*      -0.000*
                            (-1.85)      (-1.93)      (-1.93)      (-1.95)      (-1.74)
Ln (Telephones per worker)  0.014**      0.014***     0.014***     0.014***     0.013***
                            (2.44)       (2.41)       (2.50)       (2.48)       (2.25)
HDI                         0.025        0.033        0.009        0.015        0.011
                            (1.19)       (1.55)       (0.38)       (0.69)       (0.51)
ELF (Alesina)               -0.066                    -0.034**                  -0.011
                            (-1.56)                   (-2.70)                   (-0.61)
ELF * Ln (GDP/cap.)         0.006
                            (1.31)
ELF * HDI                                             0.038**                   -0.016
                                                      (1.74)                    (-0.52)
DELF                                     -0.125*                   -0.042**     -0.044
                                         (-1.84)                   (-2.01)      (-1.64)
DELF * Ln (GDP/cap.)                     0.015*
                                         (1.82)
DELF * HDI                                                         0.070**      0.109**
                                                                   (2.10)       (2.22)
Observations                38/65/       38/65/       38/65/       38/65/       38/65/
                            71/76        71/75        71/76        71/75        71/75
R²                          0.44/0.43/   0.46/0.41/   0.43/0.43/   0.44/0.41/   0.44/0.42/
                            0.51/0.34    0.49/0.34    0.51/0.34    0.50/0.35    0.50/0.40

Robust t statistics in parentheses; observation and R² values are decade specific
* p < 0.10, ** p < 0.05, *** p < 0.01
Growth is measured as annual growth rate of per capita GDP


Table 4.4: Influence of ethnic diversity on economic growth depending on economic and human
development levels, based on Alesina and La Ferrara (2005)


terms reveal a new result. Both the ethnic fractionalization and ethnic diversity indices
show a positive impact for more developed countries. Regression (5) includes both the
ELF and the DELF indices, as well as their interaction terms. Whereas most ethnicity
variables are now insignificant, the interaction term of DELF with the HDI level remains
positive and significant, albeit at a reduced level of 10%. Figure D.1 of Appendix D.1
shows the average marginal effect of the DELF dependent on the HDI level for the last two
regressions. In the case of regression (4), the DELF exhibits a positive impact for an HDI
level of 0.6 and above. This corresponds to countries like Indonesia or the Philippines.²²⁸
For regression (5), the threshold for a positive effect is already at 0.4. A positive and
significant effect (at the 5% level) is found for an HDI level of 0.7 and above. For example,
Paraguay, Tunisia, and Turkey exhibited such a level of development in the 1990s. This
result confirms the expectation that ethnic diversity, in contrast to mere heterogeneity,
has a positive impact on the economic growth of a country.
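
The HDI thresholds quoted above follow directly from the interaction specification: the marginal effect of the DELF on growth is the DELF coefficient plus the interaction coefficient times the HDI level. The sketch below uses the rounded coefficients of regressions (4) and (5) in Table 4.4; the function names are hypothetical. The significance threshold of 0.7 cannot be reproduced this way, since it additionally requires the covariance of the two estimates, which is not reported.

```python
def marginal_effect(beta_delf, beta_interaction, hdi):
    """Marginal effect of the DELF on growth at a given HDI level."""
    return beta_delf + beta_interaction * hdi


def breakeven_hdi(beta_delf, beta_interaction):
    """HDI level at which the marginal effect of the DELF turns from negative to positive."""
    return -beta_delf / beta_interaction


print(round(breakeven_hdi(-0.042, 0.070), 2))  # 0.6  (regression 4)
print(round(breakeven_hdi(-0.044, 0.109), 2))  # 0.4  (regression 5)
print(marginal_effect(-0.042, 0.070, 0.8) > 0)  # True: positive effect at an HDI of 0.8
```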

Using the broader data from Schüler and Weisbrod (2010), Alesina and La Ferrara's
(2005) result of a positive impact of ethnic heterogeneity depending on a country's income
level cannot be confirmed. However, a new insight is generated by using a broader
approach related to a country's level of development. Countries that rank higher in the
HDI may well harvest the positive effects of ethnic diversity. This is an important finding
as it is a good basis for challenging the common understanding that ethnic diversity, in
the economic context, has negative consequences. If the right conditions are in place, it
seems that it can support a country's economic success. However, the potential innovative
power of ethnic diversity only unfolds in countries that can cope with its adverse effects.


4.5 Implications of ethnic diversity on trust 


Trust between citizens or between countries can be an influential factor in various economic
fields. A lack of it can be at the root of the difficulty of agreeing on public goods (Alesina
et al., 1999; Desmet et al., 2009), or be responsible for conflicts between countries (de Groot,
2009). As trust, shared values or opinions are generally hard to measure, and data are seldom
readily available, a growing literature is devoted to discovering the roots of these factors.
Alesina and La Ferrara (2002) look for factors associated with trust between citizens in
the US, and Bjørnskov (2007, 2008) does this for a large set of countries.²²⁹ Both articles
find that economic and political opportunities play an important role, as well as cultural
aspects. For a set of 100 countries, Bjørnskov (2008) finds that countries with a
predominantly Catholic or Muslim population are less trusting, while ethnic fractionalization
does not affect trust.²³⁰ Comparable to the discussion on conflict, employing the appropriate
measure of ethnicity or culture also seems to be crucial in identifying an effect in this
strand of the literature. Diversity in this respect, again, offers an interesting new facet.


228 These HDI values correspond to the latest decade in the data, starting with 1990. 

229 For a related, yet somewhat different, approach to assessing differences between countries (instead of
within countries), see Desmet et al. (2011). They use a broad range of responses on cultural values from 
the World Value Survey, for countries in the European Union, to construct a measure of cultural distance. 


230 Equally, La Porta et al. (1999) associate countries with mainly Catholic and Muslim populations with 
inferior government performance. 
That one does not find an influence of ethnic diversity on trust is arguably due to the
fact that the fractionalization measure (ELF) does not measure diversity correctly, as it
neglects the group differences.²³¹ The varying distances between groups should be more
relevant for the level of trust than the mere number of groups. To test this, regressions
(1)-(3) in Table 4.5 first replicate the major regressions of Bjørnskov (2008).²³² The
regressions are performed using simple OLS with robust standard errors, with a ten-year
average since 2000 being employed for most of the variables. Social trust within countries is
based mainly on the results of the World Value Survey (Inglehart et al., 2004).²³³ Having a
higher level of Income inequality, as well as being a Post communist country, reduces trust,
whereas Monarchies (Monarchy) and Nordic countries are, in general, more trusting.²³⁴
High Political diversity is associated with a significantly lower level of trust. Political
competition, however, does not seem to play a major role.²³⁵ The dummies for countries
that have a dominant religion exhibit a negative impact for Catholic and Muslim countries,
albeit slightly less for the latter. Regressions (4)-(6) now re-run the first three, this
time replacing the fractionalization measure (ELF) with the diversity measure (DELF).
Contrary to expectations, the DELF coefficients remain equally insignificant. The impact
on all other variables is also very limited.

One must consider that ethnic fractions may be salient under one condition, and less
so under another. In addition, these conditions might not have been included in the original
regressions. Bjørnskov (2008) reasons on the same grounds, arguing that citizens'
sensitivity towards ethnic diversity changes over the course of a country's development.
Better institutions and social systems improve trust between groups, whereas
rising income inequality causes more frictions. Another explanation might be differences
in the segregation of ethnic communities. The trust one has in fellow citizens depends on
the people that individual is surrounded by. In the case of highly segregated communities
within a country, the group an individual trusts in is very homogeneous. Thus, a high
amount of trust in one's fellow (homogeneous) community can coexist with high diversity
in a country as long as the different ethnic groups are segregated. As it stands, the role
of ethnicity cannot be answered on the grounds of the above findings.

In a comparable approach, Disdier and Mayer (2007) try to analyze the roots of bilateral
opinions. Although trust and opinions are not exactly the same, they are often used


231 Bjørnskov (2008) applies the ELF measure compiled by Alesina et al. (2003). If one uses the ELF
based on WCE data, the results remain relatively unchanged.

232 The coefficients and significance levels differ slightly from the original paper as the data used for the
replication are based on an updated version covering additional countries. The outcomes are, however, not
affected.

233 Trust is measured on a scale between zero and 100, and reflects the percentage of the population
answering positively to the question: ‘In general, do you think that most people can be trusted?’ For some
countries, the social trust value is derived from the Afrobarometer and the Latinobarometro surveys with 
a comparable question. 

234 These include Denmark, Finland, Iceland, Norway and Sweden. 

235 Political diversity is defined as the variance of political self-placement based on WVS data. Political 
competition is the Herfindahl-Hirschman index of legislature based on the Database of Political Institutions 
from the World Bank. 
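
The two political measures defined in the footnote above can be sketched as follows. The seat counts and survey answers are invented for illustration, the function names are hypothetical, and the use of the population variance is an assumption; the original sources may normalize or weight these measures differently.

```python
from statistics import pvariance


def herfindahl(seats):
    """Herfindahl-Hirschman index: sum of squared seat shares in the legislature."""
    total = sum(seats)
    return sum((s / total) ** 2 for s in seats)


def political_diversity(self_placements):
    """Variance of political self-placement answers (population variance assumed)."""
    return pvariance(self_placements)


print(round(herfindahl([60, 30, 10]), 2))                    # 0.46
print(round(herfindahl([100]), 2))                           # 1.0 (single party, no competition)
print(round(political_diversity([1, 5, 5, 5, 10, 10]), 2))   # 10.0
```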
                           (1)          (2)          (3)          (4)          (5)          (6)
                           Trust        Trust        Trust        Trust        Trust        Trust
Income inequality          -0.47***     -0.30***     -0.51***     [illegible]  -0.34***     [illegible]
                           (-5.33)      (-3.23)      (-5.50)      (-5.92)      (-3.67)      (-5.48)
Post communist             -6.16**      -6.00**      -8.09***     -7.11***     -6.95***     -7.98**
                           (-2.43)      [illegible]  (-2.77)      [illegible]  [illegible]  (-2.31)
Monarchy                   8.33***      7.38**       8.71***      8.66***      7.13**       9.01***
                           (3.33)       (2.61)       (3.27)       (3.19)       (2.33)       (3.23)
Nordic country             16.01***     15.59**      16.59***     16.77***     15.96**      17.65***
                           (2.96)       (2.54)       (2.72)       (3.20)       (2.57)       (2.90)
Political diversity                     -2.02***                               -1.99***
                                        (-3.87)                                (-3.76)
Political comp. ('80-'05)                            3.58                                   2.87
                                                     (0.51)                                 (0.36)
Protestant                 0.08         0.08         0.07         0.08         0.07         0.07
                           (1.33)       (1.15)       (1.15)       (1.22)       (1.07)       (1.05)
Muslim                     -0.06        -0.07**      -0.07*       -0.06        -0.07**      -0.06
                           (-1.60)      (-2.17)      (-1.86)      (-1.61)      (-2.44)      (-1.65)
Catholic                   -0.06**      -0.08***     -0.06*       -0.06*       -0.08**      -0.05*
                           (-2.28)      (-2.82)      (-1.9)       (-1.9)       (-2.4)       (-1.7)
Eastern                    -0.02        0.01         -0.03        -0.03        0.02         -0.03
                           (-0.38)      (0.24)       (-0.50)      (-0.40)      (0.40)       (-0.45)
ELF (Alesina)              -2.97        -2.00        -3.49
                           (-0.77)      (-0.49)      (-0.90)
DELF                                                              -1.30        -0.94        -1.37
                                                                  (-0.18)      (-0.11)      (-0.20)
Constant                   47.41***     51.56***     47.46***     47.94***     52.70***     46.89***
                           (9.62)       (10.27)      (8.49)       (9.02)       (9.40)       (8.41)
Observations               113          89           110          109          85           107
Adjusted R²                0.565        0.675        0.571        0.572        0.675        0.567
F-Test                     30.65        31.86        28.68        31.22        30.19        28.66

Robust t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 4.5: Determinants of social trust, based on Bjørnskov (2008)


in an equivalent manner. Disdier and Mayer (2007) use the opinions of EU member coun- 
tries (EU15) towards the Central and Eastern European Countries (CEEC) before their 
admission to the EU in 2004.²³⁶ A link between opinions and trust is clearly detectable
here. The more positive the opinion of an EU member is towards an accession country, the 
higher their trust of this country will be. Disdier and Mayer (2007) also try to separate 
economic effects from the cultural affinity factors influencing the positive public opinions 
towards the accession of the respective countries. 

The dependent variable is the percentage of respondents that support the enlargement to
a given CEEC country in a Eurobarometer survey of the EU15 countries. The regressions
use robust OLS estimators with country and time fixed effects. Table 4.6 contains the


236 The countries covered were Bulgaria, the Czech Republic, Estonia, Hungary, Latvia, Lithuania, Poland, 
Romania, the Slovak Republic and Slovenia. Romania and Bulgaria only joined in 2007. 
                  (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
                  Opin.       Opin.       Opin.       Opin.       Opin.       Opin.       Opin.       Opin.
Lang. prox.       0.54***     0.59***
                  (4.64)      (5.22)
Relig. prox.      -0.30***    -0.35***
                  (-2.90)     (-3.26)
Asylum seekers    0.04**      0.04**      0.03**      0.04**      0.03*       0.03*
                  (2.14)      (2.21)      (2.06)      (2.14)      (1.66)      (1.70)
Book imports      -0.00       -0.00       -0.00       -0.00       -0.00       -0.00
                  (-1.29)     (-1.32)     (-0.86)     (-0.83)     (-1.23)     (-1.49)
Conflict years    -0.02***    -0.01***    -0.01***    -0.01**     -0.01**     -0.01**
                  (-3.04)     (-2.89)     (-2.60)     (-2.39)     (-2.58)     (-2.4)
UN voting         [illegible] 3.46***     3.73***     3.45***     [illegible] [illegible]
                  (5.75)      (5.37)      (5.72)      (5.29)      (5.42)      (5.06)
DELF_L                                    -0.46***    -0.51***
                                          (-4.13)     (-4.60)
DELF_R                                    7.02        7.38
                                          (1.20)      (1.24)
DELF                                                              -0.92***    -0.99***    -0.98***    -1.01***
                                                                  (-5.29)     (-5.85)     (-6.08)     (-6.59)
Ln (Imports)      0.06***                 0.06***                 0.05***                 0.03*
                  (3.15)                  (3.52)                  (2.93)                  (1.79)
Ln (Exports)                  0.05***                 0.05***                 0.04**                  0.05***
                              (2.67)                  (2.69)                  (2.26)                  (2.69)
GDP/cap. diff.    -0.11*      -0.11*      -0.11*      -0.11*      -0.09       -0.09       -0.14***    -0.15***
                  (-1.82)     (-1.92)     (-1.80)     (-1.87)     (-1.51)     (-1.54)     (-3.59)     (-3.89)
EU budget cont.   0.24***     0.24***     0.24***     0.24***     0.24***     0.24***     0.17***     0.16***
                  (6.09)      (6.19)      (6.03)      (6.13)      (6.20)      (6.30)      (5.79)      (5.61)
EC benefits       0.66***     0.66***     0.66***     0.66***     0.66***     0.66***     0.57***     0.55***
                  (6.32)      (6.36)      (6.29)      (6.33)      (6.41)      (6.46)      (6.40)      (6.28)
Ln (Distance)     -0.32***    -0.34***    -0.26***    -0.28***    -0.27***    -0.29***    -0.36***    -0.32***
                  (-5.35)     (-5.7)      (-4.27)     (-4.57)     (-4.55)     (-4.87)     (-7.85)     (-6.06)
Constant          -16.93      -20.82      -16.01      -19.00      -14.28      -16.74      64.22***    60.85***
                  (-0.65)     (-0.80)     (-0.63)     (-0.74)     (-0.56)     (-0.66)     (3.70)      (3.53)
Observations      677         677         677         677         677         677         860         860
Adjusted R²       0.825       0.824       0.822       0.821       0.826       0.825       0.808       0.809
F-Test            104.03      102.11      100.47      98.52       103.76      102.42      126.38      124.89

Country and year fixed effects were used; robust t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01


Table 4.6: Influence of economic and cultural affinity factors on bilateral opinions, based on
Disdier and Mayer (2007)


replicated and extended regressions of Disdier and Mayer (2007). The first two regressions
are those replicated without any changes. As cultural affinity factors, Disdier and Mayer
(2007) include Language proximity and Religious proximity,²³⁷ as well as the share of


237 For the continuous language proximity measure, Disdier and Mayer (2007) use Ethnologue data (Lewis, 
2009) with language distances calculated comparable to Fearon (2003). For the religious proximity, data 
from the Encyclopaedia Britannica (2007) are used. 
Asylum seekers and the volume of Book imports from the accession country. The Language
proximity index shows a significant, positive sign, as one would expect. The Religious
proximity index, however, displays a significant negative impact on bilateral opinion. This
is rather surprising and, upon inquiry, the authors unfortunately could not offer an
explanation for this finding either. The higher share of Asylum seekers shows the expected
positive sign and is significant at the 5% level. The final proxy for affinity between two
countries, the amount of Book imports, is, in contrast, never significant. To assess historical
frictions between countries, the number of military incidents in the period 1870-1989
(Conflict years) was included and shows a negative impact. Recent political proximity
is accounted for through correlation in voting behavior in the General Assembly of the
United Nations (UN voting). Political cooperation on global topics influences opinion in
a positive way, and all economic factors display the expected direction of influence.²³⁸


In regressions (3) and (4), the indices of language and religion proximity are replaced
by the language and religion DELF values. This substitution leaves all other variables
and the overall fit of the model nearly unchanged. A higher language distance worsens
bilateral opinion, as language differences probably reduce the information one has about
the other country, for example, due to less news coverage.239 The religious distance,
however, has no significant impact.240 Regressions (5) and (6) include the composite DELF
instead of the single-characteristic versions. In absolute terms, the DELF coefficient is
higher than those of the language proximity index and the language DELF. Again, all other
variables are affected only marginally. As the DELF is supposed to be a good proxy for the
cultural affinity between nations, all the affinity factors of Disdier and Mayer (2007) are
replaced by the composite DELF index in the last two regressions. This leads to a noticeably
increased number of observations (860 instead of 677). The coefficients for the DELF
variable increase in size, remaining highly significant. A higher cultural distance between
two countries is associated with a more negative opinion towards the accession of the
respective other country. This effect is robust and enters the regression with a higher
value than the other cultural affinity proxies.241 The overall fit of the model is more or
less unaffected.242 Based on these results, the DELF index is a very good proxy for trust or
opinions between countries and may substitute for a whole range of affinity factors deemed
relevant for bilateral opinions.


238 Other than the displayed variables in the regression table, variables for population, unemployment rate
differences, common border and imports of newspapers were included. Because they were not significant
in any of the regressions, they are not displayed here.

239 Note that the proximity measures and the DELF index enter the regression with opposite signs,
because the former measures affinity and the latter, distance.

240 This finding points to a possible mismeasurement in the religious proximity index of Disdier and
Mayer (2007), and thus, the potential source of the unexpected negative influence found in the original
two regressions.

241 The beta coefficient of the DELF is at least twice the size of that for asylum seekers.

242 The biggest change is found for the differences in GDP per capita levels, which is now significant,
even at the 1% level. The share of asylum seekers in the former regression probably absorbed most of this
influence.
4.6 Implications of ethnic diversity on trade


There are two main channels through which cultural affinity between nations is supposed
to promote trade (Combes et al., 2005). First, higher cultural affinity goes along with
better mutual understanding and knowledge. For trade, this translates into reduced
transaction costs. Both agents better understand the conditions in the other country, and
dealing with judgments on legal matters, for instance, as well as activity planning, is
somewhat easier. Access to information on legal restrictions, consumer behavior or the
practices of local business partners is less costly. The second channel promotes trade via
preferences, i.e., migrants often bring with them their preferences for goods and services
from their home country. The spread of these new products throughout their new host
countries expands demand beyond their own migrant group and intensifies mutual trade flows.
Both channels are boosted by a higher stock of immigrants, as well as by generally higher
cultural affinity and understanding between the respective nations.243


The trade-increasing effect of cultural proximity is the focus of Felbermayr and Toubal
(2010). In a standard gravity trade model, they show that trade volumes are increased
by a higher cultural affinity between both nations. Their sample consists of 32, mainly
European, countries and covers the period from 1965 to 2003. Felbermayr and Toubal
(2010) proxy cultural affinity by using the mutual voting behavior from the Eurovision
Song Contest (ESC).244 The major advantage of using ESC voting as a cultural affinity
measure is that it does not necessarily need to be symmetric between two countries.
Indeed, it seldom is. Additionally, it may vary over time, as the contest is held on a
yearly basis. Basically, all conventional measures lack these features. Felbermayr and
Toubal (2010) find that the trade-increasing effect of cultural proximity is much higher for
differentiated goods than for homogeneous goods, where essentially no significant effect is
found.

Again, the main findings of Felbermayr and Toubal (2010) are reproduced in Table
4.7. As the ESC data are time variant and not symmetric between countries, Felbermayr
and Toubal (2010) can apply more elaborate econometric models to take advantage of this
additional information. As the DELF lacks this additional information, all regressions in
Table 4.7 are performed in a slightly limited way by using cluster-robust OLS models with
importer and year fixed effects.245


243 In general, most papers find a positive correlation between migration and trade. See, for example,
Rauch (2001) or Combes et al. (2005). Wagner et al. (2002) compare a broader set of articles and outline
their different approaches, leading to different elasticities of migration regarding trade. The Economist
recently chose the role diasporas play in economic activities across borders as its cover story (The
Economist, 2011).

244 The ESC is an annual song competition, where each country casts votes for the songs from other
countries to determine the winner of the competition.

245 In their gravity models, Felbermayr and Toubal (2010) use a complete set of interaction terms for
importer/exporter and year fixed effects. Indeed, they show that standard OLS regression would significantly
underestimate the effect of cultural proximity. If anything, applying more standard econometric strategies
is therefore likely to underestimate the results. Consequently, the results discussed here are not an exact
replication, but adapted regressions.
The first four regressions use aggregate imports as the dependent variable. Regression
(1) includes standard control variables for trade costs and a set of cultural affinity
variables. Transportation costs are covered in the controls for geographical proximity
(Distance between main cities and a Common border dummy), and the formal trade policy is
covered by the joint participation in a free trade area (Common FTA). A higher distance
lowers the volume of the bilateral trade, but this is not to say that two nearby countries
necessarily trade more. FTA membership shows a significant effect on aggregate imports,
whereas a Common legal origin does not. The included standard set of cultural affinity
variables is meant to account for the reduced transaction costs in more proximal countries.
A Common language is not significant, whereas Ethnic ties, as expected, promote trade
at the 1% level of significance. Religious proximity does not exhibit any impact.

Regression (2) then includes the core measure for cultural affinity used by Felbermayr
and Toubal (2010), the ESC scores. As they are not symmetric, both voting behaviors
are included. ESC_ij is thus the voting behavior of the importing country towards the
exporting country and ESC_ji depicts the reverse situation. In contrast to the set of
cultural affinity variables in regression (1), the mutual ESC scores are attributed to the
second channel influencing trade volume, i.e., in the form of higher preferences (Felbermayr
and Toubal, 2010).246 Due to the standard regression methods used, the ESC variables
emerge as less significant than in the original regressions of Felbermayr and Toubal (2010),
and only ESC_ji is significant at the 1% level. Still, a higher affinity, measured by the
higher ESC_ji voting behavior, increases aggregate trade volumes. All other variables are only
marginally affected. Finally, regression (3) includes the DELF measure. As the DELF
measures the cultural distance between countries instead of affinity, the opposite sign
is in line with what is expected. A higher cultural distance lowers aggregate imports.
The DELF and the ESC scores are jointly included in regression (4). The coefficients and
significance levels are only marginally affected, if at all. Both variables measuring
cultural affinity are jointly relevant, whereas neither Common language nor Religious
proximity is significant. The stock of migrants, however, is still highly significant and
remains so throughout all regressions. The trade-reducing effect of a higher diversity
between two countries is sizeable. A DELF value that is higher by one standard deviation
(0.22) is associated with nearly 30% lower imports.247

Regressions (5)-(7) and (9)-(11) re-run the estimations (2)-(4), this time splitting
imports between homogeneous and differentiated goods. Trade in homogeneous goods is
executed through organized exchanges that partly overcome information and transaction
costs, and differences in preferences are irrelevant for these kinds of goods. A
Common FTA remains highly significant. For homogeneous goods, a Common legal origin
turns out to increase imports. This suggests that these variables have an influence on the
transaction cost channel (translating/contracting) rather than on the channel based on
preferences (Felbermayr and Toubal, 2010). The DELF index, in contrast, has no significant
influence on trade in homogeneous goods at conventional levels; nor do the ESC scores.
For differentiated goods, the DELF becomes highly significant at the 1% level. Both ESC
variables also impact imports significantly, although at a lower significance level of
5%. Additionally, the beta coefficient for the DELF variable is more than seven times the
size of that of either ESC variable. The Common legal origin variable again becomes
insignificant at conventional levels. This performance of the DELF underlines that it
indeed seems to be a more accurate measure of cultural proximity, in the form of common
preferences, than the other applied variables. In addition, the DELF data have
the considerable advantage that they allow researchers to expand their analyses onto a
global scale, going beyond the small set of countries participating in the Eurovision Song
Contest.


246 At least for the ethnic ties and religious proximity variables, one could argue that they promote both
channels.

247 For example, the DELF between Germany and Switzerland is 0.31, whereas the DELF between Germany
and Cyprus is 0.50.


                                      (1)          (2)          (3)          (4)
                                    Imports      Imports      Imports      Imports
Ln (Distance)                      -2.18***     -2.09***     -2.21***     -2.20***
                                   (-21.76)     (-18.30)     (-25.63)     (-22.08)
Common border                        0.01        -0.01         0.08         0.06
                                    (0.10)      (-0.09)       (0.87)       (0.59)
Asylum seekers                       0.07**       0.08**
                                    (2.24)       (2.43)
Language proximity                   1.18***      0.94***
                                    (4.10)       (2.80)
Book imports                         0.01**       0.01
                                    (2.07)       (1.35)
Newspaper imports                   -0.01*       -0.01
                                   (-1.73)      (-1.33)
Ln (Bilateral opinion) - lagged                   0.38**                    0.23
                                                 (2.23)                    (1.22)
DELF                                                          -2.03***     -1.97***
                                                              (-5.24)      (-4.41)
Constant                          -10.09***    -11.85***     -9.11***     -9.71***
                                   (-14.35)     (-10.71)     (-14.98)      (-8.57)
Observations                          679          585          864          747
Adjusted R²                         0.773        0.770        0.710        0.726
F-Test                             117.95       101.54       121.58       108.67


Robust t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01
Imports are measured as Imports / product of GDPs


Table 4.8: Influence of cultural affinity factors on EU imports, based on Disdier and Mayer (2007) 


A comparable result is found in Disdier and Mayer (2007). Besides the determinants
of bilateral opinions, they also analyze trade between EU member countries and the 10
prospective CEEC accession countries over the period 1988-2001. The aim of their study
is to identify the role that potential affinity variables between countries play in their
trade volume. Their OLS regressions are replicated in Table 4.8. Imports are regressed on a
set of geographical and cultural distance variables.248 Distance between the major cities
again has a large and significant influence, because most trade between the countries in
the data set is handled by road. Having a Common border has no impact, however.
Comparable to Felbermayr and Toubal (2010), the share of Asylum seekers, a Language
proximity index and Book imports are intended to proxy for bilateral affinity. Newspaper
imports are seen as a proxy for information access (reduced transaction costs). Asylum
seekers and Language proximity have the expected significant positive sign. Book imports
and Newspaper imports are only significant when the opinion variable is not included.
Except for the opinion variable, all affinity factors (share of Asylum seekers, the Language
proximity index, Book imports) are substituted in regressions (3) and (4) by the composite
DELF. The DELF covers all the cultural affinity factors very well, leaving the other
economic and distance factors nearly unchanged. What is more, bilateral opinion, used by
Disdier and Mayer (2007) as their main variable of interest, becomes insignificant at
conventional levels when the DELF is included.249


The results of the replications of Felbermayr and Toubal (2010) and Disdier and Mayer
(2007) show that the DELF index indeed covers the cultural distance between two countries
extremely well, in a way that reflects its influence on preferences. These preferences,
in turn, are one of the main reasons why cultural proximity increases trade volumes.


4.7 Conclusion 


The DELF is constructed to overcome two limiting factors in the analysis of ethnicity in
the economic context. It covers a rather new, but important aspect of ethnicity, namely
its diversity. In doing so, it is not intended to render all previous indices and measures
irrelevant, but to improve economic analysis in fields where diversity is more important
than fractionalization and polarization. Additionally, it offers the possibility to measure
cultural affinity between nations, which is not covered by the ELF and POL indices.


Concerning the incidence, onset and duration of conflicts, it is natural to assume
that, besides the sheer number of groups (fractionalization), the differences between these
groups also play a role. The DELF was tested on the incidence of conflict, in a replication
of Garcia-Montalvo and Reynal-Querol (2005b). It shows a stronger significance than the
polarization index in explaining conflict onset in the setting of Garcia-Montalvo and
Reynal-Querol (2005b). The possible discrimination against outside groups, during and after
the war, decisively determines the size of the potential gains in economic and political
power. This information, included in the DELF, seems to affect the decision to start a
civil conflict.


248 All regressions use country and time fixed effects.

249 Disdier and Mayer (2007) also include a non-lagged opinion variable. Using this alternative delivers
comparable results.
For economic growth, ethnic fractionalization and diversity reveal a quite comparable
negative effect. This effect, however, largely disappears when a whole set of other control
variables is included. A deeper analysis of how heterogeneity affects these channels
and of the subsequent effects on economic growth seems necessary. A new extension of the
established analyses shows that this negative effect is not universal, but depends on the
level of development in a country. Countries with a higher level of human development
(HDI) are not negatively affected. As this effect is found solely for the DELF and not for
the ELF, those countries can apparently harvest the innovation- or productivity-increasing
positive effects of ethnic diversity.

Trust, opinions and well-being are the basis for a wide set of analyses. In determining
their drivers, many authors find that, besides economic, institutional and political factors,
a range of cultural aspects are equally important. The study of Bjørnskov (2008) on social
trust is used to assess the DELF's performance in this field within countries. In contrast,
in the analysis of Disdier and Mayer (2007), the opinions between countries were the focus
of the research. In the regressions on social trust, the DELF is not significant at
conventional levels, which is no improvement upon the original setting using the ELF
measure. Nevertheless, it is surprising that these factors do not show any impact, and it
appears that either a specific factor rendering the ethnic measures salient was omitted, or
the appropriate aspect of ethnicity was not included in the regressions (Bjørnskov, 2008).
Regarding the opinion of EU member states towards the new accession countries during the
Eastern enlargement, the DELF shows a significant influence (Disdier and Mayer, 2007).
Countries that have a lower cultural distance, measured by the composite DELF, are more
open to the accession of these countries. Therefore, the DELF shows some indication that it
can be used as a good proxy for opinions and trust between countries. Its influence on
trust within countries remains unsupported.

To distinguish between the channels (transaction costs versus preferences) through
which cultural proximity is meant to impact trade, the last section draws on the studies by
Felbermayr and Toubal (2010) and Disdier and Mayer (2007). In both replications, cultural
proximity as measured by the DELF index (i.e., a lower DELF value) shows a significant
positive effect on imports. The replication of Felbermayr and Toubal (2010) additionally
showed that this effect is more prominent for differentiated goods than for homogeneous
goods. A higher cultural proximity is reflected in more aligned preferences, increasing the
trade volume between these countries, especially for more differentiated goods. Overall,
the DELF is a good substitute for a range of cultural affinity factors, without altering
the regression performances. As both studies focused on European trade flows, its validity
for global trade flows remains to be established. In contrast to most of the other cultural
affinity factors tested by the above articles, the DELF offers global coverage and is thus
well suited for these extensions.

This chapter shows the applicability of the DELF index in fields where ethnic diversity
is meant to play an important role: conflict, growth, trust and trade. It does not render
the ELF and POL indices irrelevant, but advocates for the additional importance of the
diversity aspect in many settings. The considerable advantage of the DELF data set is
its wide coverage of countries, allowing one to expand analyses onto a global scale, thus
going far beyond the limited scope of most recent papers. Future research is especially
encouraged to follow this route, expanding these analyses to examine their broader external
validity.
Appendix — Chapter 1


A.1 Mathematical Appendix 


A.1.1 Partial derivatives 


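As in Equation (1.6), the cost function is given through:250

\[ b(\theta, a_i) = \log_{\theta}(a_i) = \frac{\ln(a_i)}{\ln(\theta)} \]

For the defined range of $1 > a_i > 0$ and $1 > \theta > 0$, it follows that $\ln(a_i) < 0$ and $\ln(\theta) < 0$.

Subsequently, the partial derivatives for the ability level $a_i$ are given through:

\[ \frac{\partial b}{\partial a_i} = \frac{1}{a_i \cdot \ln(\theta)} < 0 \]

\[ \frac{\partial^2 b}{\partial a_i^2} = (-1) \cdot \frac{1}{a_i^2 \cdot \ln(\theta)} > 0 \]

$\theta$ influences the overall cost function for all individuals of group g who want to learn the
language of group h. From the above definition it follows that:

\[ \frac{\partial b}{\partial \theta} = \ln(a_i) \cdot (-1) \cdot \ln(\theta)^{-2} \cdot \frac{1}{\theta} > 0 \]


250 For simplification purposes, the subscript gh of $\theta_{gh}$ is dropped, and only $\theta$ is used.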
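The signs of the two first-order derivatives can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the evaluation point ($\theta = 0.4$, $a_i = 0.3$) are illustrative assumptions, and the comparison uses a simple central finite difference.

```python
import math


def cost(theta, a):
    """Cost b(theta, a_i) = log_theta(a_i) = ln(a_i) / ln(theta) for 0 < theta, a_i < 1."""
    return math.log(a) / math.log(theta)


def d_cost_d_a(theta, a):
    """Analytical derivative with respect to ability: 1 / (a_i * ln(theta))."""
    return 1.0 / (a * math.log(theta))


def d_cost_d_theta(theta, a):
    """Analytical derivative with respect to theta: -ln(a_i) / (theta * ln(theta)^2)."""
    return -math.log(a) / (theta * math.log(theta) ** 2)


def central_difference(f, x, h=1e-6):
    """Two-sided finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Hypothetical evaluation point inside the admissible range (0, 1).
theta, a = 0.4, 0.3
num_a = central_difference(lambda x: cost(theta, x), a)
num_theta = central_difference(lambda x: cost(x, a), theta)

print(f"db/da_i:   analytical {d_cost_d_a(theta, a):.4f}, numerical {num_a:.4f}")
print(f"db/dtheta: analytical {d_cost_d_theta(theta, a):.4f}, numerical {num_theta:.4f}")
```

Under these assumptions, the cost falls with the ability level and rises with $\theta$, in line with the signs stated above.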
For the sake of completeness, the second derivative for $\theta$ is additionally given through:

\[ \frac{\partial^2 b}{\partial \theta^2} = \ln(a_i) \cdot \left[ 2 \cdot \ln(\theta)^{-3} \cdot \frac{1}{\theta^2} + \ln(\theta)^{-2} \cdot \frac{1}{\theta^2} \right] \]

\[ = \underbrace{-\ln(a_i)}_{>0} \cdot \underbrace{\frac{1}{\theta^2 \cdot \ln(\theta)^2}}_{>0} \cdot \left[ -1 - \frac{2}{\ln(\theta)} \right] \]

Thus, the sign of the second derivative is dependent on the expression in brackets. Over the
defined range of $\theta \in \, ]0, 1[$, the internal solution for the change in sign is given through:

\[ -1 - \frac{2}{\ln(\theta)} = 0 \]
\[ \ln(\theta) = -2 \]
\[ \bar{\theta} = e^{-2} \approx 0.135 \]

Finally, the second derivative of the cost function regarding $\theta$ is defined through:

\[ \frac{\partial^2 b}{\partial \theta^2} \begin{cases} < 0 & \text{for } \theta \in \, ]0, \bar{\theta}[ \\ > 0 & \text{for } \theta \in \, ]\bar{\theta}, 1[ \end{cases} \]

A graphical representation of cost functions for various levels of $\theta$ is given in Figure A.3
of Appendix A.2.


A.1.2 Implication of relative group sizes for overall ELF values 


The ethno-linguistic fractionalization index (ELF) is based on a Herfindahl-Hirschman
concentration index:

\[ ELF = 1 - \sum_{g=1}^{k} p_g^2, \quad g = 1, \dots, k \]  (A.1)

where k is the number of groups and $p_g$ their relative group sizes. Its value moves between
zero and one and represents the probability that two randomly selected individuals from a
population come from different groups. A higher value thus indicates a more fragmented
country. With the definition that all $p_g$ represent the relative group sizes of a given country,
it also holds that:

\[ 1 = \sum_{g=1}^{k} p_g, \quad g = 1, \dots, k \]  (A.2)

For the case of three groups, which has mainly been used in this essay, it follows from the
above equations that:

\[ ELF = 1 - p_1^2 - p_2^2 - p_3^2 \]  (A.3)
\[ 1 = p_1 + p_2 + p_3 \]  (A.4)

Introducing Equation (A.4) into Equation (A.3) leads to:

\[ ELF = -2p_1^2 - 2p_2^2 + 2p_1 + 2p_2 - 2p_1 p_2 \]
\[ 0 = -2p_1^2 - 2p_2^2 + 2p_1 + 2p_2 - 2p_1 p_2 - ELF \]  (A.5)

The group size $p_1$, dependent on a given ELF level and on the relative size $p_2$, is given
through:

\[ p_1 = \frac{(1 - p_2) \pm \sqrt{1 - 2ELF + 2p_2 - 3p_2^2}}{2} \]  (A.6)

In order for Equation (A.6) to deliver a solution, it must hold that:

\[ 1 - 2ELF + 2p_2 - 3p_2^2 \geq 0 \]  (A.7)

This, in turn, leads to the following requirement for $p_2$:

\[ p_2 \leq \frac{1 + \sqrt{4 - 6ELF}}{3} \]  (A.8)

Finally, for this to hold, it additionally needs to satisfy:

\[ 4 - 6ELF \geq 0 \]
\[ ELF \leq \frac{2}{3} \]  (A.9)

For the example in this essay, the above equations limit the range within which the
respective group constellation can vary while delivering a given ELF value. For an
ELF value of 0.5, which satisfies Equation (A.9), the range is limited to $p_2 \leq \frac{2}{3}$. Figure
A.1 shows the combinations of $p_1$ and $p_3$ depending on a range of possible values for $p_2$,
leading to an equal ELF value of 0.5:


[Figure A.1: line plot; horizontal axis from 0 to 0.7; plotted series p1 and p3]


Figure A.1: Values of p2 and p3 for any given value of p1, delivering a constant ELF value of
0.5
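Equations (A.1) and (A.6) to (A.8) can be traced numerically. The sketch below is illustrative only: the function names and the grid of $p_2$ values are assumptions introduced here, and the two roots of Equation (A.6) are read as the sizes of the two remaining groups, which follows from Equation (A.4).

```python
import math


def elf(shares):
    """ELF index of Equation (A.1): one minus the sum of squared group shares."""
    return 1.0 - sum(p * p for p in shares)


def p1_p3_for(p2, elf_target):
    """Solve Equation (A.6) for the two remaining shares, given p2 and a target ELF.

    Returns None when the discriminant of Equation (A.7) is negative, i.e. when
    no three-group constellation with this p2 reaches the target ELF.
    """
    disc = 1.0 - 2.0 * elf_target + 2.0 * p2 - 3.0 * p2 ** 2
    if disc < -1e-12:
        return None
    root = math.sqrt(max(disc, 0.0))  # guard against rounding just below zero
    p1 = ((1.0 - p2) + root) / 2.0
    p3 = ((1.0 - p2) - root) / 2.0
    return p1, p3


# Hypothetical grid of p2 values up to the bound of Equation (A.8) for ELF = 0.5.
for p2 in (0.0, 0.1, 0.3, 0.5, 2.0 / 3.0):
    p1, p3 = p1_p3_for(p2, 0.5)
    print(f"p2={p2:.3f}  p1={p1:.3f}  p3={p3:.3f}  ELF={elf((p1, p2, p3)):.3f}")
```

Every row reproduces an ELF of 0.5, while a value of $p_2$ above 2/3 yields no solution.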


Philipp Kolo - 978-3-653-02395-4 
Downloaded from PubFactory at 01/11/2019 11:11:06AM 
via free access 




A.2 Additional figures and tables 


[Figure A.2: density curves; horizontal axis from 0 to 1; plotted series Beta(2,5), Beta(2,4), Beta(3,3), Beta(3,2)]


Figure A.2: Density functions for selected B(α, β) distributions


[Figure A.3: vertical axis b(θ, a_i); horizontal axis from 0 to 1; plotted series Log 0.05, Log 0.2, Log 0.4, Log 0.6]


Figure A.3: Cost functions for selected levels of θ
B.1 Details of key variables


Table B.1: Overview of variables, definitions and sources


Variable name | Description | Source
ANM | Atlas Narodov Mira (ANM) ethno-linguistic fractionalization index (ELF) | Roeder (2001)
Alesina | Ethno-linguistic fractionalization index (ELF) of Alesina | Alesina et al. (2003)
Fearon | Ethno-linguistic fractionalization index (ELF) of Fearon | Fearon (2003)
Latitude | Absolute value of the latitude of a country's capital, scaled to take values between 0 and 1, where 0 is the equator | Cepii (2011)
Altitude | Average absolute deviation of single grid and country mean altitudes (in 1,000m) | Based on G-Econ (2006)
Ln(Area) | Log of country area in square kilometers | World Bank (2009)
Agritime | Years since transition to agriculture (in 1,000 years) in relation to the base year 2000 A.D. | Putterman (2008)
Modern | State power over territory between 1800 and 1950 in years* | Putterman and Weil (2010)
Democratic tradition | Average Polity 2 score (ranging from -10 to 10), with lower values indicating less democratic, or autocratic (negative values), regimes for the years after WWII up until 1960. Only countries with observations for at least half of the years are included | Marshall and Jaggers (2008)
Tropics | % land area in Koeppen-Geiger tropics and subtropics (Af+Am+Aw+Cw) | Sachs (2001)
Regional dummies | Dummy for Eastern Europe, Latin America, North Africa and Middle East, Sub-Saharan Africa, Western countries and Asia | Based on Fearon (2003)
Colony | Dummy variable that takes a value of one if a country was colonized and 0 if not | Based on data in Olsson (2007)
Duration | Total number of years under colonial rule* | Olsson (2007)
Colonial origin dummy | Dummy variable for Spanish, French, British or Portuguese colonization of the country | Cepii (2011)
Ln(Mortality) | Log of potential settler mortality, measured in terms of deaths per annum per 1,000 'mean strength' of settlers | Acemoglu et al. (2001)
Ln(Urbanization) | Log of % of population living in urban areas | World Bank (2009)
Immigration | International migrant stock (% of population) | World Bank (2009)
Ln(Population) | Log of total population | World Bank (2009)
Polity IV | Average Polity 2 score (ranging from -10 to 10), with lower values indicating less democratic, or autocratic (negative values), regimes | Marshall and Jaggers (2008)
Conflicts | Years with summed magnitudes of all major events of political violence (MEPV) higher than 1 | Marshall (2006)
Ln(Trade) | Log of trade (% of GDP) | World Bank (2009)
Ln(Telephones) | Log of mobile and fixed-line telephone subscribers (per 100 people) | World Bank (2009)
Ln(GDP/capita) | Log of real GDP per capita in constant international dollars (Laspeyres index) - Penn World Tables | Heston et al. (2009)
HDI | Human Development Indicator, measures development along three dimensions: healthy life, GDP per capita and education | UNDP - United Nations Development Programme (1994)
Prim., Sec., Tert. Enrollment | % of population aged 15 and over that attained respective level of schooling | Barro and Lee (2010)
Prim., Sec., Tert. Completion | % of population aged 15 and over that completed respective level of schooling | Barro and Lee (2010)
Prim., Sec., Tert. Schooling | Average years of respective school attainment of population aged 15 and over | Barro and Lee (2010)


* For better readability in regression tables, variables were rescaled to decades.
Table B.2: Summary statistics of geographic and historical variables


Variable Obs. Mean Std. Dev. Min. Max. 
ANM '61 38 0.463 0.278 0.000 0.909 
ANM '85 68 0.461 0.272 0.000 0.984 
Fearon ELF 53 0.479 0.260 0.002 1.000 
Alesina ELF 8T 0.432 0.263 0.000 0.930 
Latitude 8T 0.280 0.185 0.002 0.668 
Altitude 76 0.256 0.292 0.000 1.767 
Ln (Area) 92 4.340 2.788 -6.215 9.747 
Agritime 61 4.769 2.473 0.000 10.500
Modern 45 10.002 3.193 1.875 15.000 
Democratic 72 0.354 7.257 -10.000 10.000 
Tropics 56 0.367 0.433 0.000 1.000 
Colony 92 0.641 0.481 0.000 1.000 
Colonial duration 23 18.701 13.139 2.300 50.300 
Ln (Mortality) 63 4.678 1.238 2.146 7.986 


Table B.3: Summary statistics of change variables (1960/65-1975/80) 


Variable Obs. Mean Std. Dev. Min. Max. 
Ln (Urbanization) 191 0.287 0.259 -0.454 1.415 
Immigration 63 0.534 6.039 -13.850 54.650 
Ln (Population) 83 0.325 0.223 -0.154 1.895
Primary Schooling 41 0.697 0.407 -0.150 1.860 
Secondary Schooling 41 0.597 0.443 -0.450 2.160 
Tertiary Schooling 41 0.072 0.080 -0.200 0.520 
Polity IV 12 -1.245 4.812 -15.500 17.000 
Conflict 92 1.552 4.319 0.000 21.000 
Ln (Trade) 96 0.273 0.340 -0.770 1.203 
Ln (Telephones) 10 0.830 0.485 -0.182 2.028 
Ln (GDP/cap.) 06 0.390 0.271 -0.213 1.169 
HDI 12 0.068 0.042 0.009 0.189 
B.2 Additional regressions and robustness checks


Table B.4: Influence of geographic and historical variables on Atlas Narodov Mira ELF scores.
Replication for 1961 data


(1) (2) (3) (4)
ANM '61 ANM '61 ANM '61 ANM '61
Latitude -0.880*** -0.701*** -0.373 -0.713***
(-8.46) (-5.88) (-1.40) (-5.24)
Altitude 0.101* 0.148** 0.149*** 0.203**
(1.82) (2.01) (2.71) (2.56)
Ln (Area) 0.027*** 0.041*** 0.031** 0.037**
(3.13) (4.05) (2.57) (2.01)
Agritime -0.016** -0.013 -0.004 -0.007
(-2.10) (-1.65) (-0.36) (-0.60)
Modern -0.027*** -0.023**
(-4.45) (-2.43)
Tropics 0.211**
(2.40)
Asia 0.016
(0.15)
E. Europe -0.074
(-1.33)
L. America -0.097
(-0.97)
MENA 0.033
(0.43)
SSA 0.173*
(1.71)
Democratic 0.006*
(1.69)
Constant 0.609*** -0.736*** 0.257 0.651***
(9.41) (10.77) (1.27) (3.46)
Observations 130 114 124 66
Adjusted R² 0.461 0.515 0.543 0.425
F-Test 38.464 25.091 32.019 12.886


Heteroscedasticity robust standard errors used;
t statistics in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01
Appendix C


Appendix — Chapter 3 


C.1 Data robustness and alternative data 


C.1.1 Data robustness checks 


Although the discussion in this chapter already showed the general strength of the WCE
data, some additional robustness checks are applied here. Two new data sets are created
that add some noise to the original data. If the three data sets do not differ in a
significant way, it should be reasonable to use the original data. In doing so, one accepts
errors in the range of the noise added to the original data set. The noise data are created
by altering the original group size $p_i$ to the new size $\tilde{p}_i$ with a normally
distributed random variable in a way that:

\[ \tilde{p}_i = p_i \cdot (1 + \varepsilon), \quad \text{with } \varepsilon \sim N(0; \sigma) \]  (C.1)

For $\sigma$ two different values are assumed; $\sigma_1$ uses the standard deviation of the group
distribution over all observations, and is thus equal for all countries. In contrast, $\sigma_2$ uses
a country-specific standard deviation. The scatter plot of Figure C.1 shows DELF values
for both alternative data sets against the original data.

The Spearman rank correlation is over 0.99 for both data sets and confirms their high 
congruency. For the new data based on country specific variations, some small outliers 
are identifiable. These are rather homogeneous countries with a limited number of groups 
and a clear majority group. By construction, they have a much higher probability of being 
distant from the original data. 

The granularity of the data, which is one of its major advantages, leads to a sizeable
number of very small groups. The data quality, especially for these groups, might be
debatable. Following Fearon (2003), a reduced data set is constructed excluding these
very small groups.$^{251}$ Doing this reduces the number of groups from 12,432 down to 5,674.

$^{251}$ In contrast to Fearon (2003), who limits his ELF calculation to groups greater than 1%, here a lower
threshold of 0.1% is used.
Figure C.1: Original DELF values against newly created random data sets


Excluding groups would either alter the group shares of all groups, because one would need
to rescale them, or one can alternatively create new groups that differ from all existing
groups. In what follows, the second approach is taken. Although the groups are small,
they represent some part of the population that seems to be different from the rest. In
some countries, that new group corresponds to a rather sizeable one. Thus, not accounting
for them at all would be incorrect. Combining them into one group lowers the potential
individual data inaccuracies. Analogous to the figure above, Figure C.2 compares the
DELF values of the reduced data set against the full data.
Figure C.2: Original DELF values against reduced data set
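A minimal sketch of the reduction step described above: all groups below the threshold are pooled into one new group that is treated as completely different from every remaining group. The function name `merge_small_groups`, the example shares and similarity values, and the choice to treat the 0.1% threshold as inclusive for the retained groups are assumptions made for illustration only.

```python
def merge_small_groups(shares, sim, threshold=0.001):
    """Pool all groups smaller than `threshold` into one residual group.

    The residual group gets similarity 0 to every retained group and
    similarity 1 to itself, so its population is not dropped.
    """
    keep = [i for i, p in enumerate(shares) if p >= threshold]
    residual = sum(p for p in shares if p < threshold)
    new_shares = [shares[i] for i in keep]
    new_sim = [[sim[i][j] for j in keep] for i in keep]
    if residual > 0.0:
        for row in new_sim:
            row.append(0.0)
        new_sim.append([0.0] * len(keep) + [1.0])
        new_shares.append(residual)
    return new_shares, new_sim


# Hypothetical country with two large and two very small groups.
shares = [0.6, 0.3995, 0.0003, 0.0002]
sim = [
    [1.0, 0.5, 0.8, 0.8],
    [0.5, 1.0, 0.3, 0.3],
    [0.8, 0.3, 1.0, 0.9],
    [0.8, 0.3, 0.9, 1.0],
]
reduced_shares, reduced_sim = merge_small_groups(shares, sim)
print(reduced_shares)
```

Because the pooled group keeps its population share, no rescaling of the remaining shares is needed.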


In this case, the most heterogeneous countries show an increased difference compared to
the base data. Papua New Guinea is the most apparent outlier. Because Papua New
Guinea has a huge number of small groups that are now combined into one group that
differs completely from all other groups, it appears more diverse than when accounting
for the mutual similarities of all the small groups. However, the similarity between both
data sets is still very high.
C.1.2 Alternative similarity values


The assignment of the similarity values according to the language classification is rather
clear. Here, one can easily leverage the lexical congruency between two languages and
transfer these similarity levels to the assigned $s_{kl}$ values. When the $s_{kl}$ were instead
assigned to correspond directly with the similarity levels, using the values of 1, 0.85, 0.80,
0.70, 0.50, 0.30 and 0.05 for $s_{kl}$, the overall results show only marginal changes.
However, for single countries, some slightly larger adjustments in their rank order occur.

For the ethno-racial classification, however, the congruency is more ordinal in nature.
In the essay, the assigned $s_{kl}$ follow the same decreasing slope as that of the language
classification. Nevertheless, one could also argue in favor of a linear assignment of the
$s_{kl}$ values to mirror the ordinal similarity levels. For both classifications, both similarity
slopes are pictured in Figure C.3.


[Figure C.3: two panels, (a) Language classification and (b) Ethno-racial classification, each plotting the similarity values $s_{kl}$ for the Standard (WCE) and the Linear assignment]

Figure C.3: Used similarity values $s_{kl}$ vs. linear similarity levels


From the differences in the slopes, one can easily see that for both classifications, less
distant groups are assigned higher $s_{kl}$ values under the WCE method than under a linear
assignment. For more distant groups, the opposite is the case. Countries with groups
that speak more distant languages would exhibit lower DELF values in the WCE case
than under a linear $s_{kl}$ allocation. Figure C.4 contrasts the DELF values used in the
essay with the corresponding values calculated with a linear scale for the language and
the ethno-racial classification.

The impact differs between both characteristics. Whereas the Spearman rank correlation
between both scales is again over 0.99 for the ethno-racial values, it is slightly less, at
0.94, for the language classification. The countries with the highest downward adjustments
are Papua New Guinea, Solomon Islands, Senegal, Vanuatu, Northern Mariana Islands,
Niger, Uganda, Nigeria, Switzerland and Sierra Leone. The country with a significant
upward adjustment is Trinidad and Tobago. Due to the high correlation values which
remain, the results should not be significantly impacted.
Figure C.4: DELF values based on WCE similarity values against linear scale DELF values
per characteristic


Extending the discussion above, Fearon (2003) defines his measure of cultural diversity
with:

$$r_{kl} = \left(\frac{n}{m}\right)^{\alpha} \tag{C.2}$$

where $m$ is the highest number of classifications two groups may share and $n$ the number
they actually share. This naturally leads to linear similarity values. The parameter $\alpha \in
[0, 1]$ then influences the course of the similarity value function to give it a concave
shape.$^{252}$ For the application here, this would translate into:

$$\hat{s}_{kl} = \left(s_{kl}\right)^{\alpha} \tag{C.3}$$


The idea behind assigning such a function is that early divergence between two groups
might signify more differences than small differences at a later stage. In other words, with
a rising $\alpha$, more severe differences are proportionally less important and small differences
increase in importance. Desmet et al. (2012) assume that more severe splits (i.e., completely
different languages) are more relevant for more drastic conflicts of interest (e.g.,
incidence of civil wars). More nuanced differences (i.e., different dialects), in contrast,
affect the transaction costs of coordination for any economic activity and are relevant, for
example, in explaining differences in economic growth. As a consequence, the choice of $\alpha$
might depend on the problem under scrutiny. The final selection of a value for $\alpha$, however,
remains completely arbitrary. Fearon (2003) uses a value of $\alpha = 0.5$, whereas Desmet et al.
(2009) and Esteban et al. (2010) use a value of $\alpha = 0.05$.$^{253}$ Figure C.5 shows the courses
of the applied similarity values for different concavity assumptions.

$^{252}$ This is at least the range within which Fearon (2003) limits $\alpha$. However, much larger values could still
apply, and for $\alpha = \infty$ any continuous distance measure fades and the indices merge with their dichotomous
forms.
Figure C.5: Similarity functions depending on different concavity assumptions
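The effect of the concavity parameter can be illustrated with a short sketch of the function in Equation (C.2), read here as the share of common classification levels raised to a power. The maximum number of shared levels (m = 6) is a hypothetical choice for the illustration, and the function name `similarity` is not taken from the source.

```python
def similarity(n, m, alpha):
    """Similarity (n / m) ** alpha for n shared classification levels out of m."""
    return (n / m) ** alpha


m = 6  # hypothetical maximum number of shared classification levels
for alpha in (1.0, 0.5, 0.05):
    course = [round(similarity(n, m, alpha), 3) for n in range(m + 1)]
    print(f"alpha = {alpha}: {course}")
```

With an exponent of one the course is linear. Smaller exponents keep the similarity close to one over most of the range and let it drop steeply only for groups that share very few levels.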


For the three highest similarity levels, the course for the similarity function with
$\alpha = 0.5$ (Fearon, 2003) and the assigned values of the WCE are quite comparable, yet somewhat
distinct from the linear values. Thereafter, the WCE drops faster. With the assumption
that $\alpha = 0.05$, the similarity between two groups stays very high for quite a while, dropping
steeply afterwards. The latter thus assigns rather extreme (dis)similarity values, whereas
the other functions are more continuous. As the WCE similarity classification has an
inbuilt non-linearity of similarity measures, assigning values of $\alpha$ is less important here
than it is for Fearon (2003), Desmet et al. (2009) and Esteban et al. (2010). In addition,
as the similarity values assigned by Barrett et al. (2001) in the WCE seem to be more
grounded in the real difficulties between two individuals to communicate, this essay refrains
from assigning an arbitrary value to $\alpha$.$^{254}$

$^{253}$ Indeed, Desmet et al. (2009) vary the values of $\alpha$ and conclude that these low levels show the best
performance in their analysis of redistribution.
$^{254}$ Nevertheless, DELF and D-POL values with the commonly used values of 0.05 and 0.5 for $\alpha$ may
be obtained from the author.
C.1.3 Characteristics of different ethnicity measures depending on the number of groups


Figure C.6: ELF and POL values against number of groups for 210 countries based on WCE
data

Figure C.7: Fitted ELF and DELF values against number of groups for all 210 countries
C.2 Details on similarity calculations, weighting and its implication for the interpretation of results


C.2.1 Similarity matrix calculations 


Groups, the integral component of all ELF, POL and DELF calculations, can generally
be defined for each single characteristic or by all three at the same time.$^{255}$ For all
210 countries, the WCE data consist of 11,657 groups defined by language, 4,625 groups
defined by culture, and 883 groups defined by religion. If the groups are defined by all
three characteristics at the same time, more groups can emerge as characteristics might
be combined. The definition of a group is the most granular possible, i.e., along all three
characteristics, and results in 12,432 groups in the data set used. This means that any
two groups differ at least slightly in at least one of the characteristics.

The following example illustrates the calculation of the similarity values per characteristic
and their combination to arrive at the composite DELF values. The exemplary
country consists of three groups. Thus, for the example, the set of groups $K$ within
this country is given by $K = \{A; B; C\}$. There exist two languages, L1
and L2, two ethno-racial groups, E1 and E2, and only one religion, R1. Combining these
characteristics results in three groups with the specifications below:


Group | Language   Ethno-racial   Religion
------+------------------------------------
A     | L1         E1             R1
B     | L2         E1             R1
C     | L2         E2             R1


Table C.1: Specifications of characteristics per group


For each characteristic (language, ethno-racial, and religion), similarity values ($s^L_{kl}$, $s^E_{kl}$,
and $s^R_{kl}$, with $k,l \in K = \{A; B; C\}$) between two groups can be assigned. Based on these
specifications, one can calculate a DELF value for each of the characteristics:

$$DELF_L = 1 - \sum_{k \in K} \sum_{l \in K} p_k p_l s^L_{kl} \tag{C.4}$$

$$DELF_E = 1 - \sum_{k \in K} \sum_{l \in K} p_k p_l s^E_{kl} \tag{C.5}$$

$$DELF_R = 1 - \sum_{k \in K} \sum_{l \in K} p_k p_l s^R_{kl} \tag{C.6}$$

with $k,l \in \{A; B; C\}$ and $p_k$ and $p_l$ the relative group sizes. To arrive at the similarity
values, one can set up a similarity matrix for each characteristic. For the above example,
these matrices are shown in Table C.2.


$^{255}$ Naturally, one could also combine any two of the characteristics if such a combination were recommended
for the research problem at hand.
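(a)
$$\begin{array}{c|ccc} & A & B & C \\ \hline A & s^L_{AA} & s^L_{AB} & s^L_{AC} \\ B & s^L_{BA} & s^L_{BB} & s^L_{BC} \\ C & s^L_{CA} & s^L_{CB} & s^L_{CC} \end{array}$$

(b)
$$\begin{array}{c|ccc} & A & B & C \\ \hline A & s^E_{AA} & s^E_{AB} & s^E_{AC} \\ B & s^E_{BA} & s^E_{BB} & s^E_{BC} \\ C & s^E_{CA} & s^E_{CB} & s^E_{CC} \end{array}$$

(c)
$$\begin{array}{c|ccc} & A & B & C \\ \hline A & s^R_{AA} & s^R_{AB} & s^R_{AC} \\ B & s^R_{BA} & s^R_{BB} & s^R_{BC} \\ C & s^R_{CA} & s^R_{CB} & s^R_{CC} \end{array}$$


Table C.2: Exemplary similarity matrices for the three groups (a) with mutual language $s^L_{kl}$
values, (b) with mutual ethno-racial $s^E_{kl}$ values and (c) with mutual religion $s^R_{kl}$ values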


The assumptions that $s_{kk} = 1$ and $s_{kl} = s_{lk}$ for all $k,l \in \{A; B; C\}$ hold, and for all groups
that belong to one language or ethno-racial group, a respective similarity value of one is
assigned. In the case of the religious classification, all belong to one religion, i.e., one group.
Based on the characteristic definitions in Table C.1, it follows that $s^E_{AC} = s^E_{BC} = s^E_{CA} = s^E_{CB}$.
In the following, this value is labeled $s^E$ for simplicity. This analogously holds for the
language similarity values, labeled $s^L$. The matrices of Table C.2 can be further defined with:
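(a)
$$\begin{array}{c|ccc} & A & B & C \\ \hline A & 1 & s^L & s^L \\ B & s^L & 1 & 1 \\ C & s^L & 1 & 1 \end{array}$$

(b)
$$\begin{array}{c|ccc} & A & B & C \\ \hline A & 1 & 1 & s^E \\ B & 1 & 1 & s^E \\ C & s^E & s^E & 1 \end{array}$$

(c)
$$\begin{array}{c|ccc} & A & B & C \\ \hline A & 1 & 1 & 1 \\ B & 1 & 1 & 1 \\ C & 1 & 1 & 1 \end{array}$$


Table C.3: Similarity matrices for the three groups, taking into account the specifications of their
(a) language, (b) ethno-racial and (c) religious characteristic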


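With the relative group sizes $p_A$, $p_B$ and $p_C$, one obtains an exemplary $DELF_E$ index for
the ethno-racial characteristic:

$$\begin{aligned}
DELF_E &= 1 - \sum_{k \in K} \sum_{l \in K} p_k p_l s^E_{kl} \\
&= 1 - \left(p_A p_A \cdot 1 + p_A p_B \cdot 1 + p_A p_C \cdot s^E + p_B p_A \cdot 1 + p_B p_B \cdot 1 + p_B p_C \cdot s^E + p_C p_A \cdot s^E + p_C p_B \cdot s^E + p_C p_C \cdot 1\right) \\
&= 1 - \left((p_A + p_B)^2 \cdot 1 + 2 \cdot (p_A + p_B) \cdot p_C \cdot s^E + p_C^2 \cdot 1\right)
\end{aligned}$$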
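The expansion can be checked numerically. The sketch below uses hypothetical group sizes and a hypothetical ethno-racial similarity value; only the structure of the matrix (groups A and B share one ethno-racial classification, group C differs) is taken from the example. It compares the full 3 x 3 calculation, the calculation with A and B merged into one group, and the closed form of the last line.

```python
def delf(shares, sim):
    """DELF = 1 - sum over k and l of p_k * p_l * s_kl."""
    total = 0.0
    for k, p_k in enumerate(shares):
        for l, p_l in enumerate(shares):
            total += p_k * p_l * sim[k][l]
    return 1.0 - total


p_a, p_b, p_c = 0.5, 0.3, 0.2   # hypothetical relative group sizes
s_e = 0.4                        # hypothetical ethno-racial similarity between E1 and E2

# Full matrix over the three groups A, B and C.
full = [
    [1.0, 1.0, s_e],
    [1.0, 1.0, s_e],
    [s_e, s_e, 1.0],
]
delf_full = delf([p_a, p_b, p_c], full)

# Reduced matrix with A and B merged, since they are identical on this characteristic.
reduced = [
    [1.0, s_e],
    [s_e, 1.0],
]
delf_reduced = delf([p_a + p_b, p_c], reduced)

# Closed form from the last line of the expansion.
closed_form = 1.0 - ((p_a + p_b) ** 2 + 2 * (p_a + p_b) * p_c * s_e + p_c ** 2)

print(delf_full, delf_reduced, closed_form)
```

All three numbers coincide up to floating-point rounding, which is the sense in which a more detailed group definition adds no information for a single characteristic.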


One can clearly see that for the DELF of a single characteristic, the respective most granular
split per characteristic is decisive. The group definition at a more detailed level does not
add additional information. In the above example, this would lead to a reduced 2 x 2
matrix of the one found in Table C.2(b) with one group (A+B), and the remaining group
C with the respective relative group sizes $(p_A + p_B)$ and $p_C$.$^{256}$ However, for the composite
DELF, combining all three characteristics into a composite similarity measure $\hat{s}_{kl}$ is key.
The general matrix for the composite DELF calculation is then given in Table C.4.


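$$\begin{array}{c|ccc} & A & B & C \\ \hline A & \hat{s}_{AA} & \hat{s}_{AB} & \hat{s}_{AC} \\ B & \hat{s}_{BA} & \hat{s}_{BB} & \hat{s}_{BC} \\ C & \hat{s}_{CA} & \hat{s}_{CB} & \hat{s}_{CC} \end{array}$$


Table C.4: Similarity matrix for composite DELF calculation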


The calculation of the $\hat{s}_{kl}$ depends on the mode of weighting and combining the three
characteristics. The averaging of the characteristics has important implications for the
interpretation of the resulting DELF values.$^{257}$ Extending the discussion in section
3.5, the weighting approaches, and especially their mathematical attributes, are discussed in the following.
In contrast to the exemplary case used here to demonstrate the similarity calculation, the following
discussions apply to the general case.


C.2.2 Arithmetic mean 


In the case of an arithmetic mean, as discussed in section 3.5, the composite DELF value
is calculated as:

$$DELF = 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \hat{s}_{kl} \tag{C.7}$$

with

$$\hat{s}_{kl} = \frac{1}{3} s^L_{kl} + \frac{1}{3} s^E_{kl} + \frac{1}{3} s^R_{kl} \tag{C.8}$$

where $s^L_{kl}$, $s^E_{kl}$ and $s^R_{kl}$ for all $k,l \in K$ are again the respective similarity values for the
language, ethno-racial and religious classification. In the general case, $K$ is again the total
number of groups in the given country. For the specifications of the above example, the
matrix in Table C.4 transforms, with $k,l \in K = \{A; B; C\}$, to Table C.5.

$$\begin{array}{c|ccc} & A & B & C \\ \hline A & \frac{1}{3}(1 + 1 + 1) & \frac{1}{3}(s^L + 1 + 1) & \frac{1}{3}(s^L + s^E + 1) \\ B & \frac{1}{3}(s^L + 1 + 1) & \frac{1}{3}(1 + 1 + 1) & \frac{1}{3}(1 + s^E + 1) \\ C & \frac{1}{3}(s^L + s^E + 1) & \frac{1}{3}(1 + s^E + 1) & \frac{1}{3}(1 + 1 + 1) \end{array}$$


Table C.5: Similarity matrix for the exemplary DELF calculation


For the arithmetic mean, there exists an identity between the calculation of the composite
similarity value $\hat{s}_{kl}$, as in Equation (C.8), to arrive at the composite DELF values and
the arithmetic mean of the single DELF indices. With Equations (C.4)-(C.6), for the
general case, one obtains:

$$\begin{aligned}
DELF &= \frac{1}{3}\left(DELF_L + DELF_E + DELF_R\right) \\
&= \frac{1}{3}\left[\left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^L_{kl}\right) + \left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^E_{kl}\right) + \left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^R_{kl}\right)\right] \\
&= \frac{1}{3}\left[3 - \left(\sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^L_{kl} + \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^E_{kl} + \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^R_{kl}\right)\right] \\
&= 1 - \frac{1}{3}\left(\sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^L_{kl} + \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^E_{kl} + \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^R_{kl}\right) \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \cdot \frac{1}{3}\left(s^L_{kl} + s^E_{kl} + s^R_{kl}\right) \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \hat{s}_{kl}
\end{aligned}$$

Thus, in the case of the arithmetic mean, there is no difference between the DELF
calculation following Equations (C.7) and (C.8), and an arithmetic mean over the single
DELF values. The arithmetic mean is therefore the most practical way of combining
the single indices. Besides the arguments discussed in section 3.5, this is one of the main
reasons why this approach is used.

$^{256}$ This is equivalent to the discussion in section 3.2. Only perfectly similar individuals are grouped
together and groups are meant to emerge 'endogenously'. Here, two identical groups merge into one group.
$^{257}$ All approaches portrayed here share a common, implicit assumption. They all assume that a combination
follows the same pattern, independent of the specific combination of the single characteristics, and
that the combination is equivalent in all countries. This assumption is further discussed in Appendix C.2.5.
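The identity for the arithmetic mean can also be verified numerically. The following sketch draws random group shares and three random symmetric similarity matrices with a unit diagonal; the number of groups, the seed and the helper names are assumptions for the illustration and carry no empirical meaning.

```python
import random


def delf(shares, sim):
    """DELF = 1 - sum over k and l of p_k * p_l * s_kl."""
    return 1.0 - sum(
        p_k * p_l * sim[k][l]
        for k, p_k in enumerate(shares)
        for l, p_l in enumerate(shares)
    )


def random_similarity(size, rng):
    """Random symmetric similarity matrix with ones on the diagonal."""
    sim = [[1.0] * size for _ in range(size)]
    for k in range(size):
        for l in range(k + 1, size):
            sim[k][l] = sim[l][k] = rng.random()
    return sim


rng = random.Random(0)
K = 5                                          # hypothetical number of groups
raw = [rng.random() for _ in range(K)]
shares = [x / sum(raw) for x in raw]           # relative group sizes adding up to one

s_l = random_similarity(K, rng)                # language
s_e = random_similarity(K, rng)                # ethno-racial
s_r = random_similarity(K, rng)                # religion

# Composite similarity as the equally weighted arithmetic mean of the three matrices.
composite = [
    [(s_l[k][l] + s_e[k][l] + s_r[k][l]) / 3.0 for l in range(K)]
    for k in range(K)
]

delf_composite = delf(shares, composite)
delf_mean = (delf(shares, s_l) + delf(shares, s_e) + delf(shares, s_r)) / 3.0
print(delf_composite, delf_mean)
```

Both routes give the same value up to rounding, because the double sum is linear in the similarity values.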


C.2.3 Geometric mean and partly compensating methods 


In the case of the geometric mean, there is no complementarity between the three characteristics.
If two groups differ completely in one characteristic, which is quite often the case
for religion, they are also classified to be completely different overall. For the geometric
mean, the $\hat{s}_{kl}$ calculation follows:

$$\hat{s}^{Geo}_{kl} = \left[s^L_{kl} \cdot s^E_{kl} \cdot s^R_{kl}\right]^{\frac{1}{3}} \tag{C.9}$$
Although the calculation of $\hat{s}^{Geo}_{kl}$ is not much more difficult than the standard $\hat{s}_{kl}$, it implies
a further limitation. In contrast to the arithmetic mean, where one finds equality in the
calculation of the $\hat{s}_{kl}$ and averaging the single DELF values, this is not possible for the
geometric mean. Relying on Equation (C.9), one obtains:

$$\begin{aligned}
DELF_{Geo} &= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \hat{s}^{Geo}_{kl} \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^L_{kl} \cdot s^E_{kl} \cdot s^R_{kl}\right)^{\frac{1}{3}} \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^L_{kl}\right)^{\frac{1}{3}} \left(s^E_{kl}\right)^{\frac{1}{3}} \left(s^R_{kl}\right)^{\frac{1}{3}}
\end{aligned} \tag{C.10}$$


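In contrast, calculating the geometric average of the single indices under the consideration
of Equations (C.4)-(C.6) leads to:

$$\begin{aligned}
DELF_{Geo} &= \left(DELF_L \cdot DELF_E \cdot DELF_R\right)^{\frac{1}{3}} \\
&= \left(DELF_L\right)^{\frac{1}{3}} \cdot \left(DELF_E\right)^{\frac{1}{3}} \cdot \left(DELF_R\right)^{\frac{1}{3}} \\
&= \left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^L_{kl}\right)^{\frac{1}{3}} \left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^E_{kl}\right)^{\frac{1}{3}} \left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l s^R_{kl}\right)^{\frac{1}{3}}
\end{aligned} \tag{C.11}$$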


That Equations (C.10) and (C.11) are not equivalent is straightforward to see. 
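A small numerical illustration of this non-equivalence, with hypothetical group sizes and similarity values for three groups: the DELF based on the geometric mean of the similarity values is compared with the geometric mean of the three single DELF indices. All numbers and names below are assumptions chosen only to make the difference visible.

```python
def delf(shares, sim):
    """DELF = 1 - sum over k and l of p_k * p_l * s_kl."""
    return 1.0 - sum(
        p_k * p_l * sim[k][l]
        for k, p_k in enumerate(shares)
        for l, p_l in enumerate(shares)
    )


shares = [0.5, 0.3, 0.2]        # hypothetical relative group sizes

# Hypothetical symmetric similarity matrices per characteristic.
s_l = [[1.0, 0.6, 0.6], [0.6, 1.0, 1.0], [0.6, 1.0, 1.0]]   # language
s_e = [[1.0, 1.0, 0.4], [1.0, 1.0, 0.4], [0.4, 0.4, 1.0]]   # ethno-racial
s_r = [[1.0, 0.5, 0.5], [0.5, 1.0, 1.0], [0.5, 1.0, 1.0]]   # religion

# Route 1: geometric mean of the similarity values, then one DELF calculation.
s_geo = [
    [(s_l[k][l] * s_e[k][l] * s_r[k][l]) ** (1.0 / 3.0) for l in range(3)]
    for k in range(3)
]
delf_from_similarities = delf(shares, s_geo)

# Route 2: three single DELF indices, then their geometric mean.
delf_from_indices = (
    delf(shares, s_l) * delf(shares, s_e) * delf(shares, s_r)
) ** (1.0 / 3.0)

print(delf_from_similarities, delf_from_indices)
```

Unlike in the arithmetic case, the two routes give different values, so the composite similarity values have to be computed at the most granular group level first.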


Between the geometric mean, which does not mirror the complementarity of the characteristics
at all, and the arithmetic mean, which does reflect this, Branisa et al. (2009)
suggest a third alternative. They square the components before the calculation of the
arithmetic mean. This leads to an adjusted $\hat{s}_{kl}$ with:

$$\hat{s}^{PE}_{kl} = \frac{1}{3}\left[\left(s^L_{kl}\right)^2 + \left(s^E_{kl}\right)^2 + \left(s^R_{kl}\right)^2\right] \tag{C.12}$$
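Analogously one obtains:

$$\begin{aligned}
DELF_{PE} &= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \hat{s}^{PE}_{kl} \\
&= 1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(\frac{1}{3}\left[\left(s^L_{kl}\right)^2 + \left(s^E_{kl}\right)^2 + \left(s^R_{kl}\right)^2\right]\right) \\
&= 1 - \frac{1}{3}\left[\sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^L_{kl}\right)^2 + \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^E_{kl}\right)^2 + \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^R_{kl}\right)^2\right] \\
&= \frac{1}{3}\left[\underbrace{\left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^L_{kl}\right)^2\right)}_{\neq (DELF_L)^2} + \underbrace{\left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^E_{kl}\right)^2\right)}_{\neq (DELF_E)^2} + \underbrace{\left(1 - \sum_{k=1}^{K} \sum_{l=1}^{K} p_k p_l \left(s^R_{kl}\right)^2\right)}_{\neq (DELF_R)^2}\right]
\end{aligned} \tag{C.13}$$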


As is the case with the geometric mean, one first needs to calculate the respective composite
$\hat{s}_{kl}$ values on the most granular group setting and then follow Equation (C.7) to arrive
at the composite DELF values. Figure C.8 shows a matrix scatter plot of the different
weighting schemes. Their high correlation is again confirmed by the scatter outline:


[Figure C.8: scatter plot matrix; recoverable panel labels: DELF, DELF_Geo]

Figure C.8: Scatter plots of the differently weighted DELF values


C.2.4 Principal component analysis 


Principal component analysis (PCA) is an increasingly utilized approach
to assess weights, not on theoretical grounds, but based on the data itself. Whenever
one deals with continuous data, the PCA approach is indeed a promising one. Bossert
et al. (2011) also use this approach to calculate the composite GELF values for different
diversity characteristics in the US. However, they predominantly used continuous data
such as income. For categorical data, the PCA is much more difficult to apply
(Kolenikov and Angeles, 2009). For a PCA calculation, the data need to be in numerical
format and not in categories. A possible solution for this is to turn the categories into
dummy variables and use them for the PCA calculation.$^{258}$ To apply this procedure, one
would need to define fixed categories of groups, which would work against the credo of this
essay to refrain from such an approach. Additionally, the granularity of the data would,
in any case, yield a significant number of groups and thus subsequent dummy variables.$^{259}$

To bypass these problems, a more straightforward approach is used. In contrast to the
previous weighting methods between the characteristics, the single DELF values for each
individual characteristic are used as the starting point for the PCA. Thus, the principal
components are calculated as linear combinations of the three single DELF values per
country. They are combined in a way that explains the largest part of their variation.
The first principal component explains most of the variance (62%), followed by the second
(27%), and third (11%) principal component. The assigned loading factors can then be
used to weight the sub-indices. The results of the PCA based on the three components
are displayed in Table C.6.


Components/Factors                  Comp. 1   Comp. 2   Comp. 3
$DELF_L$                              0.658    -0.018    -0.753
$DELF_E$                              0.541    -0.684     0.490
$DELF_R$                              0.523     0.730     0.441
Eigenvalue                            1.860     0.798     0.342
Proportion of explained variance      0.620     0.266     0.114
Cumulative explained variance         0.620     0.886     1.000


Table C.6: Results of the principal component analysis and factor loadings for the components
of DELF sub-indices


The loading factors found for the components of 0.66 for $DELF_L$, 0.54 for $DELF_E$ and
0.52 for $DELF_R$ confirm the equal weighting rather strongly.$^{260}$ Nevertheless, two slightly
different ways of using the loading factors can be applied in order to utilize the detailed
information of the PCA. For both indices, only the first principal component is used as it
explains most of the variance (Ogwang and Abdou, 2003).$^{261}$ The approaches differ in the
way they apply the loading factors. The first uses the calculated principal components of
each observation and follows the approach of Noorbakhsh (1998). It is calculated as:

$$DELF_{PCA,i} = 1 - \frac{d_i}{\bar{d} + 2 s_d} \tag{C.14}$$


$^{258}$ This procedure was raised by Filmer and Pritchett (2001). If the categories can be transferred into an
ordinal scale, then there exist procedures that improve the results (Kolenikov and Angeles, 2009). This,
however, is not the case for the detailed group information on which the DELF is built.

$^{259}$ For example, Bossert et al. (2011) only used five racial and four unemployment categories in their
GELF calculation.

$^{260}$ Nguefack-Tsague et al. (2011) show that PCA leads to a rather equal weighting scheme if its components
more or less demonstrate comparable correlation values. Only when these values deviate significantly does
PCA not deliver results near to an equal weighting.

$^{261}$ Additionally, the negative loading factors of the second principal component complicate the
interpretation.
with $\bar{d}$ and $s_d$ representing the mean and the standard deviation of all $d_i$. $d_i$ is the distance
of country $i$ from the most diverse country and is calculated as:

$$d_i = |z_i - z_{max}|$$

where $z_i$ are the calculated principal components for each country $i$.
A simpler alternative multiplies the components by the PCA loading factors, and
divides them by their sum (Ogwang and Abdou, 2003). As the first approach is the more
accurate one, and the results do not differ significantly, it is used here.
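The figures in Table C.6 can be cross-checked with a few lines, and the simpler alternative can be sketched in the same place. The reading of that alternative as a weighted average with the first-component loadings normalized to sum to one is an assumption, as are the function name `weighted_delf` and the example index values.

```python
# First-component loadings and eigenvalues as reported in Table C.6.
loadings = {"language": 0.658, "ethno_racial": 0.541, "religion": 0.523}
eigenvalues = [1.860, 0.798, 0.342]

# Share of variance explained by each component: eigenvalue over the sum of eigenvalues.
total_variance = sum(eigenvalues)
explained = [ev / total_variance for ev in eigenvalues]

# Assumed reading of the simpler alternative: loadings rescaled to sum to one.
loading_sum = sum(loadings.values())
weights = {name: value / loading_sum for name, value in loadings.items()}


def weighted_delf(delf_l, delf_e, delf_r):
    """Weighted average of the three single DELF values using the rescaled loadings."""
    return (
        weights["language"] * delf_l
        + weights["ethno_racial"] * delf_e
        + weights["religion"] * delf_r
    )


print([round(x, 3) for x in explained])
print({name: round(w, 3) for name, w in weights.items()})
print(round(weighted_delf(0.6, 0.4, 0.2), 3))   # hypothetical single DELF values
```

The rescaled weights stay close to one third each, which is in line with the remark that the PCA confirms the equal weighting rather strongly.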


C.2.5 Implications of similarity value construction and possible future extensions


One problem that all outlined methods share is the loss of information. The WCE data
stand out because of their granularity and the advantage that all groups are defined along
the three characteristics. In constructing the composite (average) index, one loses two
pieces of information in the case of the DELF.

Firstly, information pertaining to the spread of groups and their mutual similarities
is lost. This is a problem for any mean construction. Average values might emerge from
very different base data setups. The mutual similarities might scatter only slightly around
the mean, or be quite far apart. In the case of the DELF, one averages not only the
group sizes, but also the similarity values. The spread of similarity values is an
important piece of information, but is hard to include in the DELF index.$^{262}$

To include this information, the most straightforward statistical measure would be
to leverage the variance of the similarity values. A more elaborate method is found in
Nguefack-Tsague et al. (2011), who, regarding the HDI, assess whether development is
equal across all sub-indices, or if one or the other index deviates strongly from the overall
mean of the composite index. For this, Nguefack-Tsague et al. (2011) suggest calculating
a balance of development index (BODI).$^{263}$ When adjusted for the DELF, it follows:

$$BODI = 1 - 1.5 \cdot \left(\left(DELF_L - DELF\right)^2 + \left(DELF_E - DELF\right)^2 + \left(DELF_R - DELF\right)^2\right) \tag{C.15}$$


A BODI of one indicates that all components and the composite index are rather equal, 
whereas a BODI of zero characterizes countries where the sub-indices differ as much as 
possible from the composite index. Figure C.9 displays the DELF values versus their 
respective BODI values. 
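A minimal sketch of the BODI calculation, assuming that Equation (C.15) reads as one minus 1.5 times the sum of squared deviations of the three single DELF values from the composite DELF, and that the composite DELF is the equally weighted arithmetic mean. The function name `bodi` and the example values are hypothetical.

```python
def bodi(delf_l, delf_e, delf_r):
    """Balance index: 1 - 1.5 * sum of squared deviations from the composite mean."""
    composite = (delf_l + delf_e + delf_r) / 3.0
    squared_deviations = sum(
        (value - composite) ** 2 for value in (delf_l, delf_e, delf_r)
    )
    return 1.0 - 1.5 * squared_deviations


balanced = bodi(0.4, 0.4, 0.4)      # identical sub-indices
extreme = bodi(1.0, 0.0, 0.0)       # one sub-index at its maximum, the others at zero
intermediate = bodi(0.7, 0.3, 0.2)  # hypothetical country
print(balanced, extreme, intermediate)
```

Under this reading, the factor 1.5 normalizes the index: for sub-indices between zero and one, the sum of squared deviations from their mean is at most two thirds, so the BODI stays between zero and one.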

The most significant imbalance is for Papua New Guinea, which has a low BODI value
due to the deviation in its language and religious diversity from the composite mean.

$^{262}$ A comparable thought was behind the introduction of the POL measure. Compared to the ELF, it
covers other information about group size spread away from an equally sized duopoly.
$^{263}$ The acronym is adopted as it may very well stand for a 'balance of diversity index'.
[Figure C.9: scatter plot with DELF values on the horizontal axis and BODI values on the vertical axis; legend: BODI, Language, Ethno-racial, Religion]

Figure C.9: Scatter plots of BODI and DELF values. For countries with highest deviations,
responsible characteristics marked


Equally imbalanced are some other small islands, where some differences in the setup of
one characteristic have a large impact. The other most imbalanced countries are Bolivia
and Belize (due to religion), Senegal and Mali (due to language), and Togo (due to the
ethno-racial classification). On the other side of the coin, there are some countries that
show remarkably equal values across all the single characteristics despite a high DELF
overall. These are Nepal, Kazakhstan, Mauritius, and Suriname. The BODI thus analyzes
how differently the diversity of countries is spread, depending on the single characteristics.


The more serious problem is the second piece of information lost. By using any of the
above methods, one does not utilize the complete granularity of the data. This is most easily
seen in the case of the arithmetic mean. There is mathematically no difference between
using the average per characteristic, and the calculation of the composite similarity values
at the most granular level. This equally applies to all other methods. To use this level
of detail, one would like to assign similarity values not only per characteristic, but also to
take the specific combination of the characteristics into account. Thus, one would need to
assign specific complementarity factors between the characteristics to answer the question
of whether a Christian, German-speaking Austrian is more distant from a Muslim, English-speaking
Brit than from a Muslim, Urdu-speaking Brit. Based on these combinations,
two groups might be less similar than the difference in their languages alone suggests.
It is obvious that these differences might also vary between cultural areas.
Differences in religions might affect (dis)similarities between groups more in the Middle
East than in Europe, whereas language differences are more important in the latter.$^{264}$


The mutual distance between a Christian, English-speaking American and a Muslim,
Punjabi-speaking Pakistani might be more profound in Pakistan than in the US. This is
certainly a very important and interesting field of research. For the time being, however,
the data to assess these differences are not available. Thus, for now it is assumed that
the way of combining different characteristics is independent of the specific combination
of single characteristics, and that it is comparable in all countries. Assessing the role of
specific characteristic combinations in different cultural areas, and subsequently taking
them into account, is a crucial step in improving the DELF in the future.


C.2.6 Details of similarity interpretation between countries 


Considering the DELF between countries, it is natural to explore what a theoretical country
would look like that maximizes (or minimizes) this similarity measure with respect to a
given country. Following its definition in chapter 3.6, the DELF measures the expected
dissimilarity between two individuals randomly drawn from each country. Thus, the similarity
between two countries, as measured by the DELF, results not from the comparable
structure of their respective people but from the consideration of how similar two individuals
are when they randomly meet.$^{265}$

What would a country $j$ look like whose group constellation (profile $q$) makes
it most similar to country $i$, given its group profile $p$? Simplifying Equation (3.10) by using $p$
instead of $p_i$ and $q$ instead of $p_j$, $p$ and $q$ are row vectors of length $K$ and $M$ representing
the respective group sizes/structures. Their elements range between zero and one and add
up to a total of one. $S$ is the $K \times M$ symmetric similarity matrix with its elements equally
ranging between zero and one. The DELF between two countries is then given through:

$$DELF_{ij} = 1 - p S q^{\top} \tag{C.16}$$


As outlined earlier, the key building block for the DELF is the similarity matrix $S$. If all
its elements are zero (no group exists in both countries), the resulting DELF is equal to
one, indicating two countries that are completely different. This is in line with what
one would expect for two countries whose groups do not share any characteristic.$^{266}$ For
the case when the groups in both countries share some characteristics (and more elements
of $S$ are non-zero), their group profiles $p$ and $q$ are relevant. If the group sizes $p_{ik}$ and $p_{jm}$
with a corresponding similarity value of $\hat{s}_{km} > 0$ are small enough, both countries are still
approximately completely different, with a DELF tending to one.

$^{264}$ These group distances might even be problem specific. Some combinations might be more prone to
conflicts (e.g., religion), whereas other combinations might be more important in the field of trade (e.g.,
language).

$^{265}$ This directly follows from the general construction of the GELF (Bossert et al., 2011) and taking
the individual as the starting point of all considerations. However, the interpretation is slightly counterintuitive
as one would spontaneously regard two countries $i$ and $j$ as being 'similar' if their group profiles
are similar, i.e., if $p_{ik} \approx p_{jm}$ and the corresponding $\hat{s}_{km} \approx 1$ for all $k = 1, 2, \ldots, K$ and $m = 1, 2, \ldots, M$.

$^{266}$ Note that the respective group constellations $p$ and $q$ for both countries are irrelevant in this case.

On the contrary, a DELF value of zero between two countries is attained if both
countries consist of only one and the same, completely similar group. For any country
$i$ with more than one group ($K, M > 1$), which is the case in all countries covered by the
WCE data, the extreme value of zero is not attained. The more elements of the similarity
matrix $S$ are non-zero, the lower the resulting DELF value will be. Thus, lower values
of DELF correctly indicate country pairs where the expected dissimilarity between two
individuals is lower. However, even if two countries have identical groups ($\hat{s}_{km} \approx 1$),
it is not the identical group constellation that minimizes the DELF value. Equation (C.16) is
minimized with respect to the group constellation $q$ of the second country when

$$p S q^{\top} = \sum_{m=1}^{M} a_m q_m \tag{C.17}$$

is maximized, where $a_m$ is the $m$-th element of the vector $pS$. Now

$$\sum_{m=1}^{M} a_m q_m \leq a_n \sum_{m=1}^{M} q_m = a_n \tag{C.18}$$

where $a_n$ is the largest entry of the vector $(a_1, a_2, \ldots, a_M)$, and this maximum is attained
by setting $q_n = 1$ and $q_m = 0$ for all $m \neq n$. The country $j$ most similar to country
$i$ is thus one in which the entire population belongs to a single group, namely the
group $n$, where $n$ is the subscript of the largest entry of the vector $(a_1, a_2, \ldots, a_M)$.
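The maximization argument can be illustrated with a small numerical sketch. The group profile and the similarity matrix below are hypothetical, and both countries are assumed to draw on the same three groups. The code computes the vector of weighted similarities, puts the entire population of the second country into the group with the largest entry, and compares the result with a second country that has exactly the same profile as the first.

```python
def between_delf(p, sim, q):
    """DELF between two countries: 1 - sum over k and m of p_k * s_km * q_m."""
    return 1.0 - sum(
        p[k] * sim[k][m] * q[m]
        for k in range(len(p))
        for m in range(len(q))
    )


p = [0.6, 0.3, 0.1]             # hypothetical group profile of country i
S = [
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]                               # hypothetical similarity matrix between the groups

# a_m is the m-th element of the vector pS.
a = [sum(p[k] * S[k][m] for k in range(len(p))) for m in range(len(S[0]))]

# Index n of the largest entry; the minimizing profile puts all population there.
n = max(range(len(a)), key=lambda m: a[m])
q_single = [1.0 if m == n else 0.0 for m in range(len(a))]

delf_single_group = between_delf(p, S, q_single)
delf_same_profile = between_delf(p, S, p)
print(a, n, delf_single_group, delf_same_profile)
```

In this example the single-group country yields a lower between-country DELF than an exact copy of the first country, which is the slightly counterintuitive property discussed in the text.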
Despite the maximization result, the general interpretation of lower levels of DELF
reflecting countries that share more groups with similar characteristics is still valid. As
most countries have a high number of groups, the result of the theoretical maximization
process, in which a single group maximizes the similarity level between both countries, is less
relevant than the similarity values between those groups. However, one has to consider
that the way the DELF measures 'similarity' between two countries slightly deviates from
one's general expectation of two 'similar' countries.


C.2.7 Details of population weighting for regional means 


The DELF values between countries represent the expected dissimilarity between two 
individuals randomly drawn, each from a different country. Thus, one individual is ran- 
domly drawn from country A and the other from country B, and their mutual diversity is 
then assessed. For this assessment different population sizes of the two countries do not 
matter, as only the relative group sizes determine the probabilities to be matched. This 
concept is thus only applicable for tuples. 

As soon as an expected level of diversity between more than two countries is concerned, for example, in the case of regional averages, a different calculation applies and population size matters. The two individuals are no longer drawn randomly from each country; instead, two individuals are randomly drawn out of the region. Whether an individual is drawn from one country or another depends on the size of each country's population relative to the region's overall population. Given the countries the two individuals come from, their expected diversity is simply the DELF value between those two countries. Mathematically, the regional average of region r is given by:


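$$ DELF_r = \sum_{i=1}^{R} \sum_{j=1}^{R} \frac{n_i}{N_r} \frac{n_j}{N_r} \, DELF_{ij} = \frac{1}{N_r^2} \sum_{i=1}^{R} \sum_{j=1}^{R} n_i n_j \, DELF_{ij} \qquad \text{(C.19)} $$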


where region r consists of countries $i, j \in \{1, \ldots, R\}$. Their between-country DELF values are given by $DELF_{ij}$ for all $i, j \in \{1, \ldots, R\}$; $n_i$ and $n_j$ are the respective populations of countries i and j, and $N_r = \sum_{i=1}^{R} n_i$ is the region's total population size.

In contrast to the DELF formula in Equation (3.8), the sum does not need to be subtracted from one. In Equation (3.8), $s_{ij}$ is a measure of similarity, whereas the DELF in Equation (C.19) is already a measure of dissimilarity or diversity.
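The population-weighted regional average can be sketched in a few lines. The example below assumes a symmetric matrix of between-country DELF values whose diagonal holds each country's within-country DELF. The function name `regional_delf` and the three-country figures are hypothetical and serve only to illustrate the weighting.

```python
import numpy as np

def regional_delf(populations, delf):
    """Population-weighted regional DELF.

    populations: length-R sequence of country populations n_i
    delf:        R x R matrix of DELF_ij values (diagonal = within-country)
    Returns sum_i sum_j (n_i / N_r) (n_j / N_r) DELF_ij.
    """
    n = np.asarray(populations, dtype=float)
    D = np.asarray(delf, dtype=float)
    if D.shape != (n.size, n.size):
        raise ValueError("delf must be an R x R matrix matching populations")
    w = n / n.sum()                # probability of drawing from each country
    return float(w @ D @ w)

# Hypothetical region of three countries (illustrative values only).
populations = [80.0, 10.0, 10.0]
delf = [
    [0.2, 0.6, 0.7],
    [0.6, 0.1, 0.5],
    [0.7, 0.5, 0.3],
]

print(regional_delf(populations, delf))
```

Because the weights are population shares, the large first country dominates the result: the regional value stays close to that country's internal DELF even though its DELF values towards the two small countries are high.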

For dynamic regions, population weighting has an important implication when new countries join or members secede. When an additional country joins a specific region (e.g., the EU), it brings two different types of diversity into this region. First, it enters the region with its internal (rather homogeneous) diversity. Second, it has its external (rather heterogeneous) diversity towards all members of the region. Depending on population size differences and the two types of diversity values, the additional country can either increase or decrease the diversity in the region.
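The effect of an accession can be illustrated with a small numerical sketch. It assumes the population-weighted regional average described above, with within-country DELF values on the diagonal of the matrix. The helper names `regional_delf` and `add_country` and all numbers are hypothetical.

```python
import numpy as np

def regional_delf(populations, delf):
    """Population-weighted average of DELF_ij over all country pairs."""
    n = np.asarray(populations, dtype=float)
    w = n / n.sum()
    return float(w @ np.asarray(delf, dtype=float) @ w)

def add_country(populations, delf, new_pop, internal, external):
    """Extend a region by one country.

    internal: within-country DELF of the joining country
    external: its DELF towards each existing member (length R)
    """
    n = np.append(np.asarray(populations, dtype=float), new_pop)
    D = np.asarray(delf, dtype=float)
    e = np.asarray(external, dtype=float).reshape(-1, 1)
    D_new = np.block([[D, e], [e.T, np.array([[internal]])]])
    return n, D_new

# Hypothetical two-country region (illustrative values only).
pops = [50.0, 50.0]
delf = [[0.3, 0.5],
        [0.5, 0.3]]
before = regional_delf(pops, delf)

# Joiner A: homogeneous and close to the existing members.
after_a = regional_delf(*add_country(pops, delf, 50.0, 0.05, [0.2, 0.2]))
# Joiner B: equally homogeneous, but distant from the existing members.
after_b = regional_delf(*add_country(pops, delf, 50.0, 0.05, [0.9, 0.9]))

print(before, after_a, after_b)
```

Both joiners have the same population and the same low internal diversity; only their external diversity differs. In this example the first accession lowers the regional value and the second raises it.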
C.3 Detailed DELF data per country

Table C.7: ELF and DELF values and ranks for 210 countries

Country ELF Rank DELF Rank Delta DELF_L DELF_E DELF_R
Papua New Guinea 0.982 1 0.441 36 -35 0.942 0.360 0.021 
Congo, Dem. Rep. 0.977 2 0.258 91 -89 0.545 0.208 0.021 
Solomon Islands 0.971 3 0.402 42 -39 0.845 0.349 0.013 
Cameroon 0.966 4 0.553 7 -3 0.809 0.354 0.497 
Chad 0.963 5 0.564 5 0 0.876 0.277 0.540 
Tanzania 0.962 6 0.340 60 -54 0.307 0.181 0.533 
India 0.958 7 0.326 66 -59 0.513 0.200 0.266
Central African Republic 0.953 8 0.437 37 -29 0.703 0.208 0.399 
Vanuatu 0.948 9 0.386 49 -40 0.740 0.388 0.030 
Cote d'Ivoire 0.943 10 0.586 3 7 0.867 0.243 0.648 
United Arab Emirates 0.939 11 0.580 4 7 0.737 0.654 0.350
Mozambique 0.927 12 0.288 80 -68 0.278 0.102 0.485 
Liberia 0.921 13 0.553 8 5 0.774 0.307 0.578 
Singapore 0.917 14 0.501 16 -2 0.715 0.201 0.586 
Nigeria 0.917 16 0.551 9 7 0.861 0.240 0.553
Kenya 0.917 15 0.382 51 -36 0.621 0.279 0.246 
Ghana 0.915 17 0.458 27 -10 0.740 0.147 0.488 
Zambia 0.914 18 0.127 158 -140 0.272 0.077 0.031 
Togo 0.913 19 0.484 20 -1 0.723 0.099 0.629 
Congo, Rep. 0.910 20 0.192 125 -105 0.367 0.201 0.007 
Timor-Leste 0.904 21 0.458 28 -7 0.546 0.596 0.231 
Israel 0.903 22 0.402 43 -21 0.738 0.116 0.352 
Uganda 0.901 23 0.275 85 -62 0.570 0.219 0.036 
Benin 0.885 29 0.460 26 3 0.671 0.115 0.593 
South Africa 0.898 24 0.374 52 -28 0.520 0.478 0.123 
Guinea-Bissau 0.898 25 0.521 13 12 0.814 0.201 0.548 
Madagascar 0.892 26 0.255 94 -68 0.188 0.070 0.507 
Mali 0.887 27 0.453 33 -6 0.814 0.407 0.139 

Namibia 0.886 28 0.385 50 -22 0.575 0.539 0.041
Zimbabwe 0.884 30 0.148 144 -114 0.233 0.147 0.065 
Ethiopia 0.863 34 0.453 32 2 0.721 0.127 0.512 
Philippines 0.875 31 0.281 81 -50 0.457 0.210 0.177 
Bhutan 0.869 32 0.512 14 18 0.619 0.425 0.491 
Fiji 0.868 33 0.591 2 31 0.713 0.570 0.491 
Indonesia 0.855 37 0.303 75 -38 0.501 0.140 0.269
Iran, Islamic Rep. 0.855 35 0.344 58 -23 0.536 0.483 0.014
Burkina Faso 0.855 36 0.462 25 11 0.703 0.193 0.489 

New Caledonia 0.855 38 0.480 21 17 0.686 0.691 0.065
Sierra Leone 0.845 39 0.531 12 27 0.780 0.348 0.466 
Angola 0.845 40 0.116 166 -126 0.199 0.113 0.035 
Micronesia, Fed. Sts. 0.840 41 0.278 84 -43 0.580 0.229 0.026 
Malaysia 0.836 42 0.510 15 27 0.685 0.231 0.614 
Gabon 0.835 43 0.227 107 -64 0.453 0.189 0.039 
Italy 0.829 44 0.122 161 -117 0.224 0.094 0.047
Qatar 0.828 45 0.484 19 26 0.572 0.651 0.230 
Senegal 0.824 46 0.339 61 -15 0.734 0.181 0.101 
United States 0.823 47 0.448 35 12 0.589 0.657 0.097
Suriname 0.818 48 0.636 1 47 0.657 0.660 0.592
Lao PDR 0.816 49 0.536 11 38 0.649 0.458 0.500 
Niger 0.782 58 0.396 45 13 0.728 0.353 0.108 
Brunei Darussalam 0.809 50 0.480 22 28 0.679 0.143 0.620 
Malawi 0.807 51 0.138 148 -97 0.154 0.062 0.197 
Mauritius 0.807 52 0.560 6 46 0.609 0.518 0.551 
Peru 0.803 53 0.336 63 -10 0.421 0.576 0.010 
France 0.802 54 0.336 62 -8 0.453 0.355 0.202 
N. Mariana Islands 0.798 55 0.396 46 9 0.775 0.385 0.028 
Thailand 0.793 56 0.216 113 -57 0.304 0.155 0.189 
Belgium 0.782 57 0.314 69 -12 0.560 0.290 0.091 
Belize 0.779 59 0.494 18 41 0.677 0.708 0.096 
Kuwait 0.777 60 0.363 56 4 0.446 0.434 0.209 
Pakistan 0.777 61 0.243 102 -41 0.410 0.299 0.021 
Gambia, The 0.774 62 0.390 48 14 0.745 0.311 0.113 
Afghanistan 0.774 63 0.297 78 -15 0.500 0.388 0.003 
Morocco 0.770 64 0.187 128 -64 0.464 0.097 0.002 
Monaco 0.765 65 0.190 127 -62 0.296 0.228 0.045 
Oman 0.759 66 0.474 23 43 0.634 0.574 0.212 
Guinea 0.753 67 0.464 24 43 0.647 0.233 0.512 
Canada 0.751 68 0.419 40 28 0.632 0.455 0.171 
Mauritania 0.750 69 0.265 90 -21 0.412 0.378 0.004 
Bolivia 0.749 70 0.431 38 32 0.678 0.572 0.043 
Spain 0.745 71 0.195 120 -49 0.313 0.240 0.032
Nepal 0.744 72 0.390 47 25 0.446 0.388 0.336
Sudan 0.738 73 0.538 10 63 0.664 0.534 0.417 
Ecuador 0.737 74 0.307 73 1 0.282 0.627 0.013 
Latvia 0.728 75 0.250 97 -22 0.510 0.226 0.014 
Eritrea 0.721 76 0.398 44 32 0.508 0.189 0.498 
Guyana 0.707 77 0.457 29 48 0.248 0.600 0.522 
Nauru 0.705 78 0.449 34 44 0.690 0.432 0.226 
Myanmar 0.699 79 0.420 39 40 0.589 0.264 0.408 
Trinidad and Tobago 0.698 80 0.410 41 39 0.188 0.559 0.483 
Andorra 0.693 81 0.137 149 -68 0.213 0.164 0.034 
Cayman Islands 0.686 82 0.253 96 -14 0.237 0.480 0.043 
Bosnia and Herzegovina 0.686 83 0.351 57 26 0.273 0.281 0.499 
Guam 0.679 84 0.343 59 25 0.645 0.325 0.061 
Switzerland 0.677 85 0.317 68 17 0.572 0.274 0.106 
Colombia 0.677 86 0.224 109 -23 0.050 0.609 0.012 
Montenegro 0.671 87 0.223 110 -23 0.219 0.167 0.283 
Guatemala 0.668 88 0.364 55 33 0.571 0.522 0.000 
New Zealand 0.667 89 0.366 53 36 0.505 0.491 0.103 
French Polynesia 0.661 90 0.258 93 -3 0.447 0.325 0.001 
Brazil 0.660 91 0.216 114 -23 0.048 0.591 0.008
Mexico 0.658 92 0.249 98 -6 0.168 0.575 0.005 
Equatorial Guinea 0.655 93 0.266 88 5 0.543 0.214 0.042 
Djibouti 0.644 94 0.279 83 11 0.619 0.180 0.037 
Algeria 0.635 95 0.156 139 -44 0.401 0.065 0.003 
Iraq 0.633 96 0.326 65 31 0.454 0.489 0.036 
Estonia 0.631 97 0.299 77 20 0.449 0.437 0.010 
Luxembourg 0.620 98 0.248 101 -3 0.468 0.250 0.028 
Panama 0.616 99 0.366 54 45 0.465 0.584 0.048 
Macedonia, FYR 0.613 100 0.456 30 70 0.578 0.332 0.459 
Grenada 0.611 101 0.116 165 -64 0.156 0.193 0.000 
Kazakhstan 0.603 102 0.499 17 85 0.513 0.487 0.498 
St. Lucia 0.600 103 0.133 154 -51 0.197 0.168 0.033 
China 0.594 104 0.234 105 -1 0.223 0.035 0.445 
Egypt, Arab Rep. 0.589 105 0.065 185 -80 0.086 0.099 0.008 
Georgia 0.586 106 0.311 71 35 0.506 0.272 0.155
Greenland 0.581 107 0.241 103 4 0.385 0.338 0.000 
Bahrain 0.576 108 0.455 31 77 0.548 0.522 0.296 
Nicaragua 0.575 109 0.301 76 33 0.371 0.524 0.008
Bermuda 0.574 110 0.192 124 -14 0.138 0.438 0.001 
Virgin Islands (U.S.) 0.570 111 0.309 72 39 0.437 0.470 0.020 
Comoros 0.567 112 0.041 192 -80 0.057 0.025 0.042 
Mongolia 0.506 125 0.266 89 36 0.191 0.083 0.523 
Turkey 0.560 113 0.255 95 18 0.328 0.430 0.006 
Mayotte 0.545 114 0.335 64 50 0.495 0.492 0.019 
Netherlands 0.542 115 0.215 115 0 0.261 0.237 0.147 
Venezuela, RB 0.542 116 0.194 122 -6 0.059 0.484 0.040 
Kyrgyz Republic 0.539 117 0.291 79 38 0.334 0.297 0.242 
Albania 0.539 118 0.248 100 18 0.334 0.140 0.272 
Ireland 0.539 119 0.194 123 -4 0.488 0.073 0.020 
Australia 0.534 120 0.305 74 46 0.381 0.354 0.178 
Sri Lanka 0.503 126 0.312 70 56 0.440 0.060 0.437 
Bahamas, The 0.523 121 0.146 145 -24 0.220 0.215 0.002 
Germany 0.518 122 0.165 135 -13 0.242 0.156 0.096 
Tajikistan 0.510 123 0.325 67 56 0.467 0.449 0.058 
St. Vincent & the Gr. 0.508 124 0.199 117 7 0.210 0.272 0.113
Sweden 0.503 127 0.179 130 -3 0.255 0.207 0.074 
Chile 0.500 128 0.219 112 16 0.213 0.439 0.004 
Norway 0.492 129 0.133 152 -23 0.202 0.124 0.072
Cape Verde 0.488 130 0.270 87 43 0.446 0.364 0.000 
Liechtenstein 0.485 131 0.225 108 23 0.300 0.211 0.165 
Dominican Republic 0.481 132 0.130 156 -24 0.048 0.340 0.003 
Tuvalu 0.471 133 0.058 187 -54 0.141 0.033 0.000 
United Kingdom 0.470 134 0.176 132 2 0.244 0.183 0.101 
Bangladesh 0.341 153 0.098 172 -19 0.050 0.039 0.204 
Botswana 0.462 136 0.158 137 -1 0.175 0.137 0.162 
Tunisia 0.464 135 0.038 194 -59 0.107 0.006 0.002 
Cuba 0.449 137 0.281 82 55 0.018 0.417 0.407 
Puerto Rico 0.446 138 0.157 138 0 0.048 0.419 0.005 
Argentina 0.444 139 0.249 99 40 0.245 0.412 0.089 
Moldova 0.444 140 0.198 118 22 0.395 0.173 0.027 
Palau 0.437 141 0.258 92 49 0.401 0.373 0.000 
Netherlands Antilles 0.426 142 0.200 116 26 0.337 0.233 0.029
Saudi Arabia 0.420 143 0.197 119 24 0.263 0.243 0.086 
Libya 0.415 144 0.117 164 -20 0.172 0.139 0.039 
Ukraine 0.403 145 0.094 174 -29 0.115 0.110 0.057 
Aruba 0.399 146 0.191 126 20 0.222 0.337 0.013 
Uzbekistan 0.375 147 0.155 140 7 0.207 0.180 0.078 
Russian Federation 0.374 148 0.271 86 62 0.328 0.272 0.215 
Somalia 0.372 149 0.079 178 -29 0.147 0.063 0.026 
Jamaica 0.364 150 0.087 176 -26 0.081 0.130 0.050 
Costa Rica 0.363 151 0.136 150 1 0.083 0.308 0.018 
Bulgaria 0.337 156 0.232 106 50 0.228 0.278 0.190 
Turkmenistan 0.344 152 0.121 162 -10 0.151 0.136 0.076 
Syrian Arab Republic 0.340 154 0.152 141 13 0.217 0.204 0.033 
Dominica 0.337 155 0.110 169 -14 0.199 0.129 0.002 
Austria 0.332 157 0.151 142 15 0.221 0.145 0.085 
Belarus 0.329 158 0.041 193 -35 0.053 0.057 0.013 
Barbados 0.324 159 0.122 160 -1 0.107 0.236 0.024 
Jordan 0.321 160 0.057 188 -28 0.082 0.066 0.023 
Serbia 0.318 161 0.171 133 28 0.214 0.194 0.106 
Vietnam 0.309 162 0.221 111 51 0.265 0.149 0.250 
Paraguay 0.308 163 0.179 129 34 0.269 0.252 0.016 
Lesotho 0.308 164 0.034 195 -31 0.061 0.039 0.002 
American Samoa 0.307 165 0.135 151 14 0.277 0.115 0.014 
Uruguay 0.305 166 0.133 153 13 0.085 0.279 0.034 
Greece 0.304 167 0.166 134 33 0.261 0.132 0.104 
Swaziland 0.304 168 0.064 186 -18 0.098 0.078 0.016 
Lebanon 0.302 169 0.239 104 65 0.276 0.259 0.183 
Hungary 0.290 170 0.178 131 39 0.223 0.285 0.026 
Lithuania 0.284 171 0.132 155 16 0.269 0.120 0.008 
Honduras 0.270 172 0.129 157 15 0.124 0.257 0.006 
West Bank and Gaza 0.266 173 0.150 143 30 0.155 0.052 0.243 
Antigua and Barbuda 0.262 174 0.093 175 -1 0.072 0.198 0.008 
Croatia 0.248 175 0.097 173 2 0.150 0.121 0.021 
Slovak Republic 0.247 176 0.142 147 29 0.207 0.217 0.001 
Azerbaijan 0.244 177 0.145 146 31 0.177 0.173 0.086
Cambodia 0.233 178 0.195 121 57 0.219 0.203 0.163 
Isle of Man 0.222 179 0.027 204 -25 0.015 0.064 0.002 
Kosovo 0.220 180 0.163 136 44 0.214 0.099 0.175 
Romania 0.216 181 0.124 159 22 0.173 0.191 0.008 
El Salvador 0.215 182 0.104 170 12 0.106 0.204 0.001 
Marshall Islands 0.210 183 0.111 168 15 0.122 0.210 0.000 
Samoa 0.210 184 0.086 177 7 0.207 0.051 0.000 
Yemen, Rep. 0.195 185 0.074 180 5 0.137 0.063 0.023 
Slovenia 0.192 186 0.054 190 -4 0.079 0.046 0.037 
Finland 0.177 187 0.101 171 16 0.146 0.142 0.015 
Cyprus 0.173 188 0.112 167 21 0.170 0.123 0.042 
Portugal 0.173 189 0.074 181 8 0.056 0.144 0.023
Denmark 0.165 190 0.117 163 27 0.144 0.122 0.086
San Marino 0.164 191 0.010 207 -16 0.029 0.002 0.000 
St. Kitts and Nevis 0.153 192 0.073 182 10 0.066 0.105 0.049 
Sao Tome and Principe 0.153 193 0.052 191 2 0.058 0.098 0.000 
Rwanda 0.147 194 0.032 198 -4 0.013 0.044 0.039 
Iceland 0.141 195 0.054 189 6 0.107 0.052 0.004 
Malta 0.119 196 0.073 183 13 0.110 0.108 0.001 
Seychelles 0.117 197 0.070 184 13 0.087 0.110 0.014 
Czech Republic 0.109 198 0.033 197 1 0.050 0.042 0.006 
Haiti 0.108 199 0.010 208 -9 0.008 0.021 0.001 
Poland 0.102 200 0.033 196 4 0.065 0.035 0.001 
Armenia 0.100 201 0.077 179 22 0.099 0.090 0.042 
Burundi 0.099 202 0.028 202 0 0.022 0.038 0.025 
Tonga 0.094 203 0.031 200 3 0.055 0.035 0.004 
Korea, Rep. 0.059 204 0.032 199 5 0.045 0.009 0.041 
Maldives 0.059 205 0.028 203 2 0.043 0.018 0.022 
Faeroe Islands 0.058 206 0.006 210 -4 0.010 0.009 0.000 
Channel Islands 0.055 207 0.029 201 6 0.053 0.029 0.005 
Kiribati 0.050 208 0.021 205 3 0.050 0.014 0.000 
Japan 0.048 209 0.019 206 3 0.032 0.011 0.014 
Korea, Dem. Rep. 0.019 210 0.007 209 1 0.015 0.006 0.000 
Table C.8: Country-pairs with highest mutual (dis)similarities^267

Region Country A Region Country B DELF DELF_L DELF_E DELF_R

Most similar pairs
SSA Burundi SSA Rwanda 0.047 0.068 0.041 0.032
MENA Jordan MENA Egypt 0.072 0.118 0.083 0.015
MENA Jordan MENA Yemen 0.081 0.155 0.065 0.023
MENA Egypt MENA Yemen 0.083 0.151 0.082 0.015
LA Antigua LA St. Kitts 0.085 0.070 0.155 0.029
Western Iceland Western Faeroe I. 0.086 0.115 0.141 0.002
MENA Jordan MENA Tunisia 0.089 0.217 0.037 0.012
MENA Egypt MENA Tunisia 0.091 0.214 0.055 0.005
MENA Egypt MENA Libya 0.093 0.136 0.120 0.024
MENA Yemen MENA Tunisia 0.098 0.247 0.035 0.012

Most dissimilar pairs
Asia Kiribati MENA Algeria 1.000 1.000 1.000 1.000
Asia Korea, Rep. SSA Niger 1.000 1.000 1.000 1.000
Asia Lao PDR SSA Eritrea 1.000 1.000 1.000 1.000
Asia Bhutan SSA Gabon 1.000 1.000 1.000 1.000
Asia Bhutan SSA Congo, Rep. 1.000 1.000 1.000 1.000
SSA Djibouti Asia Lao PDR 1.000 1.000 1.000 1.000
Asia Lao PDR MENA Tunisia 1.000 1.000 1.000 1.000
Asia Lao PDR SSA Mauritania 1.000 1.000 1.000 1.000
Asia Lao PDR MENA West Bank 1.000 1.000 1.000 1.000
Asia Lao PDR MENA Morocco 1.000 1.000 1.000 1.000

D.1 Summary statistics for all replications


Table D.1: Summary statistics for replications of Garcia-Montalvo and Reynal-Querol (2005b) 


Variable Obs. Mean Std. Dev. Min. Max. 
Conflict 1,096 0.145 0.352 0.000 1.000 
Ethnic fractionalization 1,096 0.442 0.277 0.010 0.959 
Ethnic polarization 1,096 0.516 0.248 0.017 0.982 
Religious fractionalization 1,096 0.284 0.235 0.001 0.782 
Religious polarization 1,096 0.468 0.356 0.001 1.000 
Ln (GDP/capita) 1,016 7.733 1.046 5.416 10.710 
Ln (Population) 1,092 15.390 1.951 10.638 20.908 
Primary exports 1,039 0.166 0.185 0.002 2.139 
Mountains 1,088 15.311 20.074 0.000 82.200
Non contiguous 1,096 0.155 0.361 0.000 1.000 
Democracy 896 0.459 0.499 0.000 1.000 

Table D.2: Summary statistics for replications of Schüler and Weisbrod (2010)


Variable Obs. Mean Std. Dev. Min. Max. 
Growth 476 0.018 0.027 -0.085 0.173 
Africa 640 0.294 0.456 0.000 1.000 
Latin America 640 0.238 0.426 0.000 1.000 
Ln (GDP/cap.) 460 7.741 1.035 5.438 10.053 
(Ln (GDP/cap.))^2 460 60.995 16.269 29.573 101.056
Ln (Schooling) 399 1.518 0.588 0.039 2.576 
Assassinations 476 0.000 0.000 0.000 0.001 
Financial depth 445 0.395 0.322 0.002 2.977 
Black market premium 505 0.236 0.406 -0.064 3.181 
Fiscal surplus/GDP 413 -0.234 4.102 -83.393 0.112 
Ln (Telephones/worker) 553 1.266 0.898 -1.398 2.860 
ELF (Alesina) 584 0.439 0.274 0.000 0.930 


Table D.3: Summary statistics for replications of Bjørnskov (2008) 


Variable Obs. Mean Std. Dev. Min. Max. 
Trust 116 25.483 13.466 3.400 68.076
Income inequality 113 41.391 11.371 21.500 70.700 
Post communist 116 0.216 0.413 0.000 1.000 
Monarchy 116 0.164 0.372 0.000 1.000
Nordic country 116 0.043 0.204 0.000 1.000 
Political diversity 89 5.149 1.860 2.074 12.066
Political competition ('80-05) 112 0.542 0.211 0.132 1.000
Protestant 116 15.022 23.779 0.000 95.000 
Muslim 214 10.672 25.250 0.000 100.000 
Catholic 116 30.266 36.561 0.000 98.000 
Eastern 116 5.776 19.733 0.000 95.100 
ELF (Alesina) 116 0.397 0.240 0.002 0.930 

Table D.4: Summary statistics for replications of Disdier and Mayer (2007)


Variable Obs. Mean Std. Dev. Min. Max. 
Opinion 887 -0.239 0.529 -1.992 1.208 
Language proximity 1,960 0.179 0.097 0.000 0.456 
Religion proximity 1,960 0.373 0.202 0.073 0.843 
Asylum seekers 1,540 1.443 1.403 0.000 4.615 
Book imports 1,960 2.225 8.057 -4.605 16.872 
Conflict years 1,960 2.064 3.248 0.000 12.000 
UN voting 1,960 4.438 0.088 4.193 4.585 
Ln (Exports) 1,287  -24.720 1.253 -31.859  -19.884 
Ln (Imports) 1,287 -24.932 1.233 -32.797  -20.240 
GDP/cap. differences 1,400 9.695 0.505 6.844 10.402 
EU budget contribution 1,3310 -0.757 1.615 -5.826 0.641 
EC benefits 1,750 3.971 0.329 2.890 4.500 
Ln (Distance) 1,960 7.164 0.5277 5.479 8.106 
Common border 1,960 0.057 0.232 0.000 1.000 
Newspaper imports 1,960 -2.807 4.966 -4.605 16.396 


Table D.5: Summary statistics for replications of Felbermayr and Toubal (2010)


Variable Obs. Mean Std. Dev. Min. Max. 
Aggregate imports 10,560 19.096 2.540 7.084 25.081 
Aggregate imports (hom. goods) 7,161 16.609 2.398 6.908 23.066 
Aggregate imports (diff. goods) 7,826 18.490 2.409 8.006 24.282 
Common law 10,560 0.183 0.387 0.000 1.000 
Common language 10,560 0.061 0.240 0.000 1.000 
Religion proximity 10,560 0.210 0.248 0.001 0.854 
Ethnic ties 10,457 7.816 2.742 0.000 15.433 
Common FTA 10,560 0.308 0.462 0.000 1.000
Ln (Distance) 10,560 7.306 0.627 4.394 8.565
Common border 10,560 0.091 0.287 0.000 1.000
ESC_ij 12,356 0.259 0.330 0.000 1.000
ESC_ji 12,356 0.259 0.330 0.000 1.000

D.2 Marginal effects of DELF


Figure D.1: Average marginal effects of DELF dependent on HDI levels with 90% confidence intervals for regressions (4) and (5) of Table 4.4